Aim
Reproducibility of experiments is crucial to foster an atmosphere of open, reusable and trustworthy research. To improve and reward reproducibility and to give more visibility and credit to the effort of tool developers in our community, authors of accepted papers will be invited to submit possible artifacts associated with their paper for evaluation, and based on the level of reproducibility they will be awarded one or more badges.
The goals of the artifact evaluation are manifold. We want to encourage authors to provide more substantial evidence for their papers and to reward authors who aim for reproducibility of their results and therefore create artifacts. Also, we want to give more visibility and credit to the effort of tool developers in our community. Furthermore, we want to simplify the independent replication of results presented in the paper and to ease future comparisons with existing approaches.
Artifact submission is optional. Papers whose artifacts are successfully evaluated will be awarded one or more artifact badges, but the result of the artifact evaluation will not alter the paper’s acceptance decision. We aim to assess the artifacts themselves and not the quality of the research linked to the artifact, which has been assessed by the iFM 2024 program committee already. The goal of our review process is to be constructive and to improve the submitted artifacts. An artifact should be rejected only if it cannot be improved to achieve sufficient quality in the given time frame or if it is inconsistent with the paper.
Important Dates
All dates are Anywhere on Earth (AoE, UTC -12).
- 12 August 2024 – Artifact registration deadline
- 19 August 2024 – Artifact submission deadline
- 16 September 2024 – Final artifact notification
Reviewing Criteria
All artifacts are evaluated by the artifact evaluation committee. Each artifact will be reviewed by at least two committee members. Reviewers will read the accepted paper and explore the artifact to evaluate how well the artifact supports the claims and results of the paper.
Important: There will be no smoke test, as in our experience it adds little when the authors have tested their artifact properly.
Available Badge
The available badge is the bare minimum that an artifact should allow: a binary with instructions on how to run a problem is sufficient for this badge.
Functional Badge
The official description has more details, but in short, a part of the paper's results should be reproducible in the virtual environment you picked (Docker / VM).
Reusable Badge
The artifact needs a description of how to reproduce the work outside of the virtual environment you picked (Docker / VM) and, even better, a description of how to use the tool for another purpose.
Assessment Phase
In the assessment phase, reviewers will try to reproduce any experiments or activities and evaluate the artifact w.r.t. the questions detailed above. The final review is communicated using EasyChair.
Awarding
Authors may use all granted badges on the title page of the respective paper. iFM awards the evaluation and availability badges of EAPLS. The availability badge will be awarded if the artifact is made permanently and publicly available and has a DOI. We recommend services like Zenodo or figshare for this. Also, the artifact needs to be relevant and add value beyond the text in the paper.
The evaluation badge has two levels, functional and reusable. Each successfully evaluated artifact receives at least the functional badge. The reusable badge is granted to artifacts of very high quality.
Artifacts that are not exercisable, as, for example, protocols used for empirical studies, will be evaluated only according to the Available badge, as Functional and Reusable badges are not applicable.
Artifact Submission
An artifact submission consists of
- An abstract, to be written directly in EasyChair, that:
  - summarizes the artifact,
  - explains its relation to the paper, and
  - describes which badges the authors submit for.
- A PDF file of the most recent version of the accepted paper, which may differ from the submitted version to take reviewers’ comments into account. Please also look at the Artifact Packaging Guidelines below for more detailed information about the contents of the artifact.
- A DOI of the artifact, as a link to a repository that provides a DOI, such as Zenodo, figshare, or Dryad.
- A SHA256 checksum of the artifact file. We need the checksum to ensure the integrity of your artifact. You can generate it using one of the following command-line tools:

      sha256sum <file>                   # Linux
      CertUtil -hashfile <file> SHA256   # Windows
      shasum -a 256 <file>               # macOS
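As a rough illustration of how such a checksum can be used on the receiving side, the following sketch records a checksum next to a file and later verifies it. It assumes GNU coreutils `sha256sum` (the Linux tool listed above); the function names and the file name `artifact.zip` are hypothetical and only serve the example.

```shell
# Sketch: record and later verify a SHA256 checksum (assumes GNU coreutils).
# "artifact.zip" in the usage comments is a hypothetical file name.

# record_checksum FILE: write "<hash>  FILE" to FILE.sha256
record_checksum() {
    sha256sum "$1" > "$1.sha256"
}

# verify_checksum FILE: succeed only if FILE still matches FILE.sha256.
# Must be run from the same directory that record_checksum was run in,
# because the recorded path is stored exactly as it was given.
verify_checksum() {
    sha256sum -c --quiet "$1.sha256"
}

# Usage (not executed here):
#   record_checksum artifact.zip
#   verify_checksum artifact.zip && echo "checksum OK"
```

Any change to the file after the checksum was recorded makes `verify_checksum` fail, which is the property the submission form relies on.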
The abstract and the PDF file of your paper must be submitted via EasyChair.
If you cannot submit the artifact as requested or encounter any other difficulties in the submission process, please contact the artifact evaluation chairs prior to submission.
Packaging Guidelines
Your artifact should contain the following elements:
- The main artifact, i.e., data, software, libraries, scripts, etc. required to replicate the results of your paper, and any additional software required by your artifact, including an installation description in the README. We recommend using a Docker image, but you can also use an OVA file based on Ubuntu 24.04 LTS.
- A LICENSE file describing the rights. Your license needs to allow the artifact evaluation committee members to download and evaluate the artifact, e.g., download, use, execute, and modify the artifact for the purpose of artifact evaluation. Please refer to typical open-source licenses. Artifacts without an open-source license are also accepted, but a type of license needs to be specified, which allows the committee to assess the artifact. For quick help about possible licenses, visit choosealicense.com.
- The README file should introduce the artifact to the user, i.e., describe what the artifact does, and guide the user through the installation, set up tests, and replication of your results. Ideally, it should contain:
  - The structure and content of the artifact.
  - The steps to set up your artifact (do not assume that all reviewers are familiar with Docker).
  - Which claims the artifact reproduces, with evaluation scripts to compare the results to the paper.
  - A time estimate of how long the reproduction takes. It should not take more than 8 hours (2 threads) to reproduce the entire set of experiments.
We provide an example README based on the CAV24-AE README available on Zenodo: doi.org/10.5281/zenodo.11118824.
In case your experiments cannot be replicated inside Docker or a VM, please contact the Artifact Evaluation Committee chairs before submission. Possible reasons include the need for special hardware (FPGAs, GPUs, clusters, robots, etc.) or software licensing issues. In any case, you are encouraged to submit a complete artifact. This way, the reviewers have the option to replicate the experiments in the event they have access to the required resources.
In case your artifact requires more than 8 hours (or more memory) to reproduce, please provide:
- The full set of log files you obtained.
- The limited set of log files you obtained when running the artifact. This should not be the directory where the log files are produced by your experiments in the artifact!
- The directory where the log files are produced by your experiments in the artifact.
- Scripts that allow reproducing the results for both the subset from the artifact and the full set.
The structure of those three directories should be similar so that the reviewers can compare the log files (while being aware that the cluster might or might not be faster than the machine the artifact is executed on).
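One way to check that two of these log directories have a similar structure is to compare their lists of relative file paths. The sketch below is only an illustration under assumptions: the function names are hypothetical, the directories are whatever paths you pass in, and it relies on standard `find`, `sort`, and `comm`.

```shell
# Sketch: report log files present in one directory but missing from another.
# Function names and the directory names in the usage comments are hypothetical.

# list_logs DIR: print the relative paths of all files under DIR, sorted.
list_logs() {
    (cd "$1" && find . -type f | LC_ALL=C sort)
}

# missing_logs REFERENCE CANDIDATE: print every file path that exists under
# REFERENCE but has no counterpart under CANDIDATE.
missing_logs() {
    _ref=$(mktemp)
    _cand=$(mktemp)
    list_logs "$1" > "$_ref"
    list_logs "$2" > "$_cand"
    # comm -23 keeps only the lines unique to the first (sorted) file.
    LC_ALL=C comm -23 "$_ref" "$_cand"
    rm -f "$_ref" "$_cand"
}

# Usage (not executed here):
#   missing_logs logs-limited logs-full    # expect no output
#   missing_logs logs-limited logs-output  # files the run failed to produce
```

This only compares file names, not contents; timings and other machine-dependent values in the logs still need to be compared by hand or by the evaluation scripts.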
If you decide to use a VM, make sure that it does not use the internet by including all packages (therefore, no sudo apt install XXX is allowed inside a VM).
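For artifacts packaged as a Docker image, a comparable check is to start the container with networking disabled, so that a forgotten package shows up as an error rather than a silent download. This is a sketch and not a requirement of the evaluation; the function name, image name, and script name are hypothetical, while `--rm` and `--network none` are standard `docker run` options.

```shell
# Sketch: run an artifact container with networking disabled.
# The image name and command in the usage comment are hypothetical.

# run_offline IMAGE [COMMAND...]: start IMAGE without any network access
# and remove the container again once the command has finished.
run_offline() {
    image=$1
    shift
    docker run --rm --network none "$image" "$@"
}

# Usage (not executed here):
#   run_offline my-artifact:latest ./run_experiments.sh
```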
Recommendations for Authors
- We recommend preparing your artifact in such a way that any computer science expert without dedicated expertise in your field can use your artifact, in particular to replicate your results. For example, keep the evaluation process simple, and provide easy-to-use scripts and a detailed README document.
- Furthermore, the artifact and its documentation should be self-contained.
- However, we recommend not making the README longer than necessary. If your tool uses a standard input language (like DIMACS or SMT-LIB) and a standard output (s SATISFIABLE or s UNSATISFIABLE), a link to a description is enough.
- We do not expect a full execution of your artifact if the runtime is more than 8 hours (or even thousands of hours if a beefy cluster is actually needed).
- Indicate precisely in the paper what you are reproducing (like Table 4 or Figure 5).
Recommendations for Reviewers
As there is no smoke test, the reviewers might have to debug a little bit. Some important remarks:
- Docker uses a lot of disk space. We suggest running

      docker system prune -a --volumes

  to delete all unused Docker data, including stopped containers, unused images, and unused volumes (we have observed removal of 135GB (!) while preparing an artifact previously).
- Docker requires root rights and we expect that the Docker commands are run by a user with sudo rights.
- In case you have Docker errors, please search for them in your favourite search engine (for “no space left on disk” please read the first point).
- We do not expect you to spend more than 10 minutes on debugging.
Artifact Evaluation Committee
The artifact evaluation chairs are:
- Daniela Kaufmann, TU Wien, Austria
- Mathias Fleury, University of Freiburg, Germany
The artifact evaluation program committee members are:
- Bruno Andreotti, Universidade Federal de Minas Gerais
- Chen Chen, The Hong Kong University of Science and Technology (Guangzhou)
- César Cornejo, Universidad Nacional de Río Cuarto
- Mario Frank, University of Potsdam
- Bernhard Gstrein, University of Freiburg
- Thomas Hader, TU Wien
- Simone Heisinger, Johannes Kepler University Linz
- Maurice Laveaux, Eindhoven University of Technology
- Yong Li, University of Liverpool
- Anik Momtaz, Michigan State University
- Danilo Pianini, University of Bologna
- Florian Pollitt, University of Freiburg
- Mouhammad Sakr, University of Luxembourg
- Dimitrios Thanos, Leiden University
- Dieter Vandesande, Vrije Universiteit Brussel
- Andy Oertel, Lund University and University of Copenhagen
- Alex Ozdemir, Stanford University