1st International Workshop on
Comparative Empirical Evaluation of Reasoning Systems
held as part of IJCAR 2012
June 30, 2012

Programme, Proceedings, and Invited Speakers

The programme is now available and features invited talks.

The proceedings of the workshop are available for download as a single PDF file; they are also published in the CEUR-WS Workshop Proceedings series (a BibTeX entry is available).

Motivation and Goal

Benchmark libraries and competitions are two popular approaches to comparative empirical evaluation of reasoning systems. It has become accepted wisdom that regular comparative evaluation of reasoning systems helps focus research, identify relevant problems, bolster development, and advance the field in general. The number of competitions has been rapidly increasing lately. At the moment, we are aware of about a dozen benchmark collections and two dozen competitions for reasoning systems of different kinds. We feel that it is time to compare notes.

What are the proper empirical approaches and criteria for effective comparative evaluation of reasoning systems? What are the appropriate hardware and software environments? How should the usability of reasoning systems be assessed? How should benchmarks and problem collections be designed, acquired, structured, published, and used?

The issue has recently gained a new dimension, as systems with a strong interactive aspect have entered the arena. Three competitions dedicated to deductive software verification within the last two years show a dire need for comparative empirical evaluation in this area. At the same time, it remains largely unclear how to systematically evaluate the performance and usability of such systems, and how to separate technical aspects from the human factor.

The workshop aims to advance comparative empirical evaluation by bringing together current and future competition organizers and participants, maintainers of benchmark collections, as well as practitioners and the general scientific public interested in the topic.

Furthermore, the workshop intends to reach out to researchers specializing in empirical studies in computer science outside of automated reasoning. We expect such connections to be fruitful but unlikely to come about on their own.

Important Dates

  • Submission deadline: May 1st, 2012 (extended from April 16th)
  • Notification of acceptance: May 29th, 2012 (extended from May 21st)
  • Final version due: June 5th, 2012 (extended from May 28th)
  • Workshop: June 30, 2012


Topics

The scope of the workshop includes (but is not limited to) the topics listed below. All topics apply to comparative evaluation of reasoning systems; reports on evaluating a single system (such as case studies done with a particular system) are outside the scope of the workshop.
  • Comparative case studies
  • Criteria for empirical evaluation
  • Design, organisation, and conclusions from competitions
  • Design, acquisition, execution, and dissemination of benchmarks
  • Experience reports
  • Hardware and software environments
  • Inter-community collaboration
  • Languages and language standards
  • Practitioner perspectives
  • Software and code quality evaluation
  • Surveys and questionnaires
  • Usability studies

Submissions and Proceedings

Papers can be submitted either as regular papers (6-15 pages in LNCS style) or as discussion papers (2-4 pages). Regular papers should present previously unpublished work (completed or in progress), including proposals for system evaluation, descriptions of benchmarks, and experience reports. Discussion papers are intended to initiate discussions, should address controversial issues, and may include provocative statements.

All submitted papers will be refereed by the program committee and selected on the basis of the referee reports. The collection of accepted papers will be distributed at the workshop and will also be published in the CEUR Workshop Proceedings series.

Submission upload is now closed.

Program Committee


You can reach the workshop organizers at compare2012@verifythis.org. For all other questions about this page, contact Vladimir Klebanov.