The Computer Journal Advance Access published online on February 8, 2008

The Computer Journal, doi:10.1093/comjnl/bxm120

© The Author 2007. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved. For Permissions, please email: journals.permissions@oxfordjournals.org

Three Kinds of Probabilistic Induction: Universal Distributions and Convergence Theorems

Ray J. Solomonoff1,2,*

1 Computer Learning Research Centre Royal Holloway, University of London, London, UK
2 IDSIA, Galleria 2, CH-6928 Manno-Lugano, Switzerland

* Corresponding author: rjsolo{at}ieee.org http://world.std.com/~rjs/pubs.html

We describe three kinds of probabilistic induction problems and give a general solution for each, together with convergence theorems showing that the solutions tend to give good probability estimates. The first kind extrapolates a sequence of strings and/or numbers. The second extrapolates an unordered set of strings and/or numbers. The third extrapolates an unordered set of ordered pairs of elements that may be strings and/or numbers: given the first element of a new pair, the task is to obtain a probability distribution over the possible second elements. Each of the three kinds of problem is solved using an associated universal distribution. In each case a corresponding convergence theorem shows that, as sample size grows, the expected error in the probability estimate decreases rapidly. The solutions are very general and cover a great variety of induction problems. Time series prediction, grammar discovery (for both formal and natural languages), curve fitting, the identification problem and the categorization problem are a few of the kinds of problems amenable to the methods described.

Key Words: algorithmic probability • universal probability distribution • machine learning • statistical learning theory • classification • identification problem • curve fitting • prediction • regression • grammatical induction • time series prediction • proof of convergence
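
As an informal illustration of the first kind of problem (sequence extrapolation), the sketch below predicts the next bit of a binary sequence with a Bayesian mixture over a small, finite set of models. It is not the universal distribution of the paper, which ranges over all computable models and is not computable in general; it is a toy stand-in. The function name `mixture_predict`, the Bernoulli model class, and the code lengths used as priors (weight 2^-length) are all assumptions made for this example.

```python
import math


def mixture_predict(bits, hypotheses):
    """Return the mixture probability that the next bit is 1.

    bits:        observed sequence of 0/1 values.
    hypotheses:  list of (theta, code_length) pairs, where theta in (0, 1)
                 is a Bernoulli parameter and code_length is a hypothetical
                 description length in bits, giving prior weight 2**-code_length.
    """
    log_weights = []
    for theta, code_length in hypotheses:
        # Log prior: shorter descriptions get more weight.
        log_w = -code_length * math.log(2)
        # Log likelihood of the observed bits under this model.
        for b in bits:
            log_w += math.log(theta if b == 1 else 1.0 - theta)
        log_weights.append(log_w)

    # Normalise in log space to avoid underflow on long sequences.
    peak = max(log_weights)
    weights = [math.exp(lw - peak) for lw in log_weights]
    total = sum(weights)

    # Posterior-weighted average of each model's prediction for a 1.
    return sum(w * theta for w, (theta, _) in zip(weights, hypotheses)) / total


# Hypothetical model class: four Bernoulli sources with assumed code lengths.
HYPOTHESES = [(0.5, 1), (0.25, 2), (0.75, 3), (0.9, 4)]

# A deterministic sample in which three quarters of the bits are 1.
sample = [1, 1, 1, 0] * 50

prior_prediction = mixture_predict([], HYPOTHESES)
posterior_prediction = mixture_predict(sample, HYPOTHESES)
```

With no data the prediction is just the prior-weighted average of the models. As the sample grows, the posterior weight concentrates on the model that best matches the observed frequency, so the prediction moves toward 0.75 here. This mirrors, in a very restricted setting, the qualitative behaviour the convergence theorems describe.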

