Material Detail

Experiment Databases for Machine Learning / BenchMarking Via Weka

This video was recorded at the NIPS Workshop on Machine Learning Open Source Software, Whistler 2008.

Experiment Databases for Machine Learning

Experiment Databases for Machine Learning is a large public repository of machine learning experiments as well as a framework for producing similar databases for specific goals. This project aims to bring together the information contained in many machine learning experiments and organize it in a way that allows everyone to investigate how learning algorithms have performed in previous studies. To share such information with the world, a common language is proposed, dubbed ExpML, capturing the basic structure of a large range of machine learning experiments while remaining open for future extensions. This language also enforces reproducibility by requiring links to the datasets and algorithms used and by storing all details of the experiment setup. All stored information can then be accessed by querying the database, creating a powerful way to collect and reorganize the data and enabling a very thorough examination of the stored results. The current publicly available database contains over 500,000 classification and regression experiments, and has both an online interface and a stand-alone explorer tool offering various visualization techniques. This framework can also be integrated into machine learning toolboxes to automatically stream results to a global (or local) experiment database, or to download experiments that have been run before.

BenchMarking Via Weka

BenchMarking Via Weka is a client-server architecture that supports interoperability between different machine learning systems. Machine learning systems need to provide mechanisms for processing data and evaluating generated models. In our system, the server hosts all the data and performs all the statistical analyses, while the client performs all the pre-processing and model building. This separation of tasks opens up the possibility of offering a cross-platform and cross-language framework. By performing statistical analyses on the host, we avoid unnecessary exchange and conversion of generated results.
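
The division of labour described for BenchMarking Via Weka can be illustrated with a small in-process sketch. It is not the project's actual API: the names BenchmarkServer, MajorityClassClient, and run_benchmark are hypothetical, the data are made up, and no network layer is shown. The point is only that the server keeps the data and computes the statistics, while the client builds the model and sends back nothing but predictions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class BenchmarkServer:
    """Hypothetical server: owns the data and performs all evaluation."""
    train: list  # (features, label) pairs handed out for model building
    test: list   # (features, label) pairs; test labels never leave the server
    results: dict = field(default_factory=dict)

    def get_training_data(self):
        return list(self.train)

    def get_test_features(self):
        # Only the features are exposed, not the true labels.
        return [features for features, _ in self.test]

    def submit_predictions(self, client_name, predictions):
        if len(predictions) != len(self.test):
            raise ValueError("one prediction per test instance is required")
        correct = sum(p == label for p, (_, label) in zip(predictions, self.test))
        accuracy = correct / len(self.test)
        self.results[client_name] = accuracy  # statistics stay on the server
        return accuracy


class MajorityClassClient:
    """Hypothetical client: a stand-in for any machine learning toolkit."""

    def fit(self, training_data):
        labels = [label for _, label in training_data]
        self.majority = Counter(labels).most_common(1)[0][0]

    def predict(self, feature_rows):
        return [self.majority for _ in feature_rows]


def run_benchmark(server, client, client_name):
    # Client side: fetch data, build the model, produce predictions.
    client.fit(server.get_training_data())
    predictions = client.predict(server.get_test_features())
    # Server side: evaluate the predictions and record the result.
    return server.submit_predictions(client_name, predictions)


# Made-up toy data, for illustration only.
server = BenchmarkServer(
    train=[([0.1], "a"), ([0.2], "a"), ([0.9], "b")],
    test=[([0.15], "a"), ([0.8], "b"), ([0.3], "a"), ([0.25], "a")],
)
accuracy = run_benchmark(server, MajorityClassClient(), "majority-baseline")
print(accuracy)
```

Because the client only exchanges plain data and predictions with the server, a client written in another language or built on another toolkit could in principle take part without converting evaluation results between systems.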

