Dr. Jörg Zimmermann

Address: Institute for Computer Science - University of Bonn
Endenicher Allee 19A
53115 Bonn, Germany
Office: Room 3.036
Phone: +49 228 73-60938
Fax: +49 228 73-4212
Email: jz@cs.uni-bonn.de


I am a postdoctoral researcher at the Institute for Computer Science, University of Bonn, and a member of the Artificial Intelligence Foundations Group (AIF). I have worked on an EU project using evolutionary optimization to design real-world radio networks, and I am currently teaching courses on applying methods from statistics and machine learning to the analysis of biological data. The plethora of methods and approaches used in machine learning and optimization has inspired me to search for foundations and unifying principles of these fields.



Research Interests & Goals


Nothing is more important than to see the sources of invention, which are, in my opinion, more interesting than the inventions themselves.

Gottfried Wilhelm Leibniz


Axiomatic Intelligence Theory

Currently, my research focuses on the foundations of artificial intelligence, aiming to establish a sound theoretical framework for concepts like intelligence, learning, and self-improvement. Ultimately, this should result in a set of axioms for "intelligence theory", much like the axiomatic foundations we have today for set theory (ZFC) or probability theory (the Kolmogorov axioms).

Intelligence through Learning and Self-Improvement

The envisioned intelligence theory should be a theory whose main goal is not to describe, characterize, or implement a developed mind, but to find the laws of cognitive dynamics which will lead to the emergence of mind. It will treat intelligence as an inherently dynamic phenomenon, not only with regard to its interactions with the outside world, but also with regard to its internal organization. This outlook on intelligence theory can be seen as analogous to the development of physics, which has evolved via the stages of statics and kinematics into a dynamic theory of physical phenomena. In this sense, intelligence theory aims to discover the laws of evolution of mind instead of describing a snapshot of a developed mind. The Gödel machine introduced by J. Schmidhuber already captures this spirit very well. A scalable version of the Gödel machine, the Gödel agent, was presented at the Conference on Artificial General Intelligence (AGI 2015). Eventually, this should lead to a set of rules describing a universal intelligence dynamics - a dynamics which will send any seed mind on an ascending trajectory of ever higher levels of cognition, intelligence, and insight.

Representation and Processing of Uncertainty

One step in the direction of axiomatic foundations was a thorough analysis of the concept of uncertainty, which has led to an algebraic approach to characterizing uncertainty calculi. This topic, algebraic uncertainty theory, was developed in my PhD thesis. Algebraic uncertainty theory enables a unifying perspective on reasoning under uncertainty by deriving, not defining, the structure of uncertainty values - it is not a YAUC (yet another uncertainty calculus). Confidence theory, the theory resulting from the proposed axiom system NC12, subsumes probability theory and Dempster-Shafer theory and can solve longstanding problems, such as achieving coherent conditionalization and resolving the Ellsberg paradox (see the slides of my defense talk). To my knowledge, no other current uncertainty calculus is able to do this. How to further develop and apply confidence theory, and how to integrate it with other lines of research, is the topic of ongoing investigations.
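
For readers unfamiliar with it, the single-urn form of the Ellsberg paradox can be restated and checked mechanically. The following Python sketch only reproduces the classical paradox - it is not the confidence-theoretic resolution, which is laid out in the defense slides:

    # Single-urn Ellsberg setup: 30 red balls and 60 balls that are black or
    # yellow in unknown proportion. Typical preferences are "bet on red rather
    # than on black" and "bet on black-or-yellow rather than on red-or-yellow".
    # No single probability assignment supports both preferences at once.
    p_red = 30 / 90
    for black in range(61):                         # every possible composition
        p_black, p_yellow = black / 90, (60 - black) / 90
        bet_red = p_red > p_black                                    # red over black
        bet_black_or_yellow = p_black + p_yellow > p_red + p_yellow  # black-or-yellow over red-or-yellow
        assert not (bet_red and bet_black_or_yellow)                 # never both
    print("no single probability measure explains both preferences")

An interval-valued assignment such as P(black) ∈ [0, 2/3] can at least express the ambiguity that drives these preferences; this is the kind of situation a more general uncertainty calculus has to handle.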

Effective Universal Induction

Another step in this endeavor was the analysis of the incomputability of universal induction, i.e., induction using the whole program space as possible models, as it arises in the framework introduced by R. Solomonoff in 1964. If this framework is changed by embedding the agent and the environment into the same time structure - a synchronous agent framework - the incompatibility of universality and effectivity vanishes. This is outlined in an article presented at the Turing Centenary Conference in 2012.
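
To make the flavor of an effective (computable) induction procedure concrete, here is a minimal Python sketch of a resource-bounded mixture predictor over an explicitly enumerated hypothesis class - repeating bit patterns up to a fixed length, weighted by a crude complexity prior. This is only an illustration of mixture-based induction over a bounded model class; it is not the synchronous agent framework of the article:

    from itertools import product

    def hypotheses(max_len=8):
        # Repeating bit patterns up to length max_len, weighted by a crude 2^(-2n) prior.
        for n in range(1, max_len + 1):
            for pat in product("01", repeat=n):
                yield "".join(pat), 2.0 ** (-2 * n)

    def predict_next(observed, max_len=8):
        # Mixture probability that the next bit is '1', given the observed prefix.
        w_one = w_total = 0.0
        for pat, w in hypotheses(max_len):
            stream = (pat * (len(observed) // len(pat) + 2))[: len(observed) + 1]
            if stream[: len(observed)] == observed:    # hypothesis consistent with the data
                w_total += w
                if stream[len(observed)] == "1":
                    w_one += w
        return w_one / w_total if w_total else 0.5

    print(predict_next("010101"))   # small value: the mixture expects the
                                    # alternating pattern to continue with 0

Because the hypothesis class and the resources are bounded, such a predictor is trivially computable; the theoretical question is how much of the universality of Solomonoff's construction can be retained under such constraints.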

Efficient Universal Induction

After the finding that universal induction can be made effective ("effective" just means computable in computer science), the focus has shifted to making it efficient. For this, one has to tackle in detail the problem of a reference machine, i.e., of a typical machine assumed to be available for implementing universal induction. A first line of attack on this reference machine problem was the development of a "machine theory", for which a core axiomatic system was introduced at MCU 2013. The goal here is to define a standard reference machine from first principles, or at least to reduce the contingent aspects of such a reference machine to a minimum.

Measuring the Relevant Complexity

Another important topic for achieving efficiency of universal induction is the initial complexity of algorithms, not their asymptotic complexity. Initial complexity deals with the resource requirements of an algorithm on the inputs that actually occur. Sometimes the asymptotic behavior reflects the initial behavior, but it can be misleading, and it is a working hypothesis of mine that in the context of artificial intelligence this divergence of initial and asymptotic behavior is the typical case, not the exception. There can be huge improvements in the initial behavior of an algorithm which do not count at all from the asymptotic complexity perspective. So one goal of future research is to look for algorithmic improvements within the initial complexity paradigm.
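
As a hypothetical illustration of this divergence (the example is mine, not from the text above): on the short inputs that actually occur in many applications, an algorithm with worse asymptotic complexity can beat an asymptotically superior one, simply because the asymptotic view ignores constant factors and small-input behavior.

    import random, timeit

    def insertion_sort(a):                 # O(n^2), but very low constant factors
        a = list(a)
        for i in range(1, len(a)):
            x, j = a[i], i - 1
            while j >= 0 and a[j] > x:
                a[j + 1] = a[j]
                j -= 1
            a[j + 1] = x
        return a

    def merge_sort(a):                     # O(n log n), but higher overhead per element
        if len(a) <= 1:
            return list(a)
        mid = len(a) // 2
        left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
        out, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                out.append(left[i]); i += 1
            else:
                out.append(right[j]); j += 1
        return out + left[i:] + right[j:]

    data = [random.random() for _ in range(16)]        # a "typical" small input
    for sort in (insertion_sort, merge_sort):
        t = timeit.timeit(lambda: sort(data), number=20000)
        print(sort.__name__, round(t, 3), "seconds")

On inputs of this size the quadratic algorithm typically wins; which algorithm is better on the inputs that actually occur is an empirical question about initial behavior that asymptotic analysis alone cannot answer.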

Robustness as a Basic Design Principle

As a cross-cutting topic, I am strongly interested in all kinds of robustness analyses in order to monitor, control, and communicate the remaining contingent aspects of design decisions for reference machines, induction systems (especially priors in Bayesian inference), and agent policies. If one cannot eliminate all contingent aspects of a system, one should at least make transparent how changes in the contingent aspects propagate through the system and how they affect the conclusions it draws or the actions it takes. This often boils down to the following question: if one perturbs a part A of a system S by ε, how can one bound the effect of this perturbation on another part B of S, i.e., how can one find a function f such that the perturbation of part B is bounded by f(ε)? Of course, in order to perform such a robustness analysis, one needs meaningful metric structures for the different parts of a system, and that was one reason to propose a candidate for a metric on machine space in the MCU article.
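
A minimal sketch of such an analysis for the Bayesian-prior case mentioned above (the hypothesis space, likelihoods, and perturbations are invented for illustration): perturb the prior by ε in total variation distance and record how far the posterior moves, which gives an empirical handle on the function f.

    import numpy as np

    def posterior(prior, likelihood):
        unnorm = prior * likelihood
        return unnorm / unnorm.sum()

    likelihood = np.array([0.7, 0.2, 0.1])   # P(data | hypothesis), held fixed
    prior = np.array([0.3, 0.3, 0.4])        # the perturbed part A of the system

    for eps in (0.01, 0.05, 0.10):
        shift = np.array([eps, -eps, 0.0])   # move eps of mass between two hypotheses,
        perturbed = prior + shift            # so the total variation distance is eps
        tv_post = 0.5 * np.abs(posterior(perturbed, likelihood)
                               - posterior(prior, likelihood)).sum()
        print(f"eps = {eps:.2f}   posterior moved by {tv_post:.3f} (total variation)")

For real systems, turning such empirical observations into a guaranteed bound f(ε) requires exactly the kind of metric structures mentioned above.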

Motivation

Finally, this research is motivated not only by sheer curiosity, but also by the conviction that intelligent tools can advance our civilization just as mechanical tools have done in the past. Of course, powerful tools create not only opportunities, but also risks. However, given the state of our world, relinquishment is not an option. We have to develop strategies which keep the risks of these tools in check, while reaping their potential to find smart solutions for the problems that currently plague our planet.








AGI 2015 Short Sequence Prediction Challenge


In times of Big Data, here is a small-data problem: What is your (or your favorite prediction system's) guess for the twelfth bit?

0 0 0 1 0 0 0 1 0 0 0 ?

Your (or your system's) guess could also be a probability or a probability interval that the twelfth bit is 1.

If you think the sequence is too short, then what is long enough? (Hint: Most people I have asked have a quick and strong opinion on the twelfth bit :)

There is no "correct" answer - at least I do not know it (yet). But the AGI community, deeply concerned with prediction and universal induction, should at least strive for a consensus answer to the question of the twelfth bit. Maybe this will result in the definition of a standard reference machine, which, in my opinion, would be an important stepping stone for future developments in universal induction and AGI.
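
For illustration only - neither of the following predictors is claimed to be the "right" one - here is how two very simple models answer the question; the gap between their answers is exactly the kind of disagreement a standard reference machine would have to settle:

    bits = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0]

    # 1. Laplace's rule of succession: ignores order and just counts the ones.
    laplace = (sum(bits) + 1) / (len(bits) + 2)

    # 2. A minimal periodicity detector: find the shortest period consistent
    #    with the data and predict its continuation.
    def periodic_guess(seq):
        for p in range(1, len(seq) + 1):
            if all(seq[i] == seq[i % p] for i in range(len(seq))):
                return seq[len(seq) % p]

    print(laplace)               # about 0.23
    print(periodic_guess(bits))  # 1, from the period-4 pattern 0 0 0 1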

Guesses, probabilities (probability intervals), and comments can be sent to jz@bit.uni-bonn.de.





Teaching





    Summer Lecture 2015: Analysis of Microarray Data with Methods from Machine Learning and Network Theory


    Labcourse WS 2014/15: Programming Machine Learning Methods for Microarray Data Analysis

  • Task 1: Noise Features
  • Task 2: Synergy Score
  • Task 3: Bootstrapping
  • Task 4: Algorithmic Efficiency

    Labcourse 2014: Microarray Analysis with R

  • Week 2, Part 1: Classification
  • Week 2, Part 2: Lasso Method (PPI Network Mapping is optional)
  • Week 2, Part 3: Network Analysis and Clustering

    Seminar 2014: Inference and Design Principles of Biological Networks


    Summer Lecture 2014: Analysis of Microarray Data with Methods from Machine Learning and Network Theory


    Labcourse WS 2013/14: Programming Machine Learning Methods for Microarray Data Analysis




Publications


PhD Thesis

  • Jörg Zimmermann: Algebraic Uncertainty Theory - A Unifying Perspective on Reasoning under Uncertainty, University of Bonn, 2013 [PDF]

    Thesis Defense Presentation [Slides]



    Abstract:

    The question of how to represent and process uncertainty is of fundamental importance to the scientific process, but also in everyday life. Currently there exist many different calculi for managing uncertainty, each having its own advantages and disadvantages. In particular, almost all of them define the domain and structure of uncertainty values a priori - e.g., one real number, two real numbers, a finite domain, and so on - but maybe uncertainty is best measured by complex numbers, matrices, or yet another mathematical structure. This thesis investigates the notion of uncertainty from a foundational point of view, provides an ontology and an axiomatic core system for uncertainty, and derives, rather than defines, the structure of uncertainty values. The main result, the ring theorem, which states that uncertainty values are elements of the [0,1]-interval of a partially ordered ring, is used to derive a general decomposition theorem for uncertainty values, splitting them into a numerical interval and an "interaction term". In order to illustrate the unifying power of these results, the relationship to Dempster-Shafer theory is discussed, and it is shown that all Dempster-Shafer measures over finite domains can be represented by ring-valued uncertainty measures. Finally, the historical development of approaches to modeling uncertainty which led to the results of this thesis is reviewed.