Help others to build upon the technical contributions of your article!
The Artifact Evaluation (AE) process is a service provided by the Programming Journal to help authors of accepted articles (including accepted articles subject to minor revisions) to extend the reach of their work so future researchers can build on and compare with that work. The results can be of any form (implementations, data, analysis results). The AEC will read the article and explore the artifact to give feedback about how well the artifact supports the article and how easy it is for future researchers to use it.
Submissions to AE are voluntary. Articles evaluated positively in the AE process will receive a seal of approval to be included on the first page of the article. If authors agree to make their artifacts publicly available, we will publish and archive the artifact alongside their article via Zenodo as part of the Zenodo ‹Programming› community. Articles will include references to their artifacts and artifacts references to the corresponding articles.
Three badges are available, based on the ACM artifact reviewing badges: Functional, Reusable, and Available (details below).
Please consult the journal’s submission timeline for the timeline of the artifact evaluation.
The artifact is evaluated in relation to the expectations set by the article. For an artifact to be accepted, it must support all the main claims made in the article. Thus, in addition to just running the artifact, evaluators will read the article and may try to modify provided inputs or otherwise slightly generalize the use of the artifact from the article in order to test its limits.
Artifacts should be:
The Artifact Evaluation Committee (AEC) takes the position of future researchers and then asks themselves how much this artifact would help them. Please see details of the outcomes of artifact evaluation (badges) for further guidance.
All accepted articles (including accepted articles subject to minor revisions) are eligible to submit artifacts.
Your submission should consist of three parts, all in one file:
README.mdof your artifact,
sha256sumcommand-line tool to generate the hash) or
git reflog --no-abbrev)
The URL must be a public Google Drive, Dropbox, GitHub, Bitbucket, or GitLab URL, to help protect the anonymity of the reviewers.
Artifacts do not need to be anonymous; reviewers will be aware of author identities.
Submit your artifact documentation via the online submission system.
Please upload a single file that includes the URL to your artifact, the hash of the artifact at submission time, and the content of the README file.
Your README.md should consist of three parts:
Your “Getting Started Guide” should contain setup instructions (including, for example, a pointer to the VM player software, its version, passwords if needed, etc.) and basic testing of your artifact that you expect a reviewer to be able to complete in 30 minutes or less. Reviewers will follow all the steps in the guide during an initial kick-the-tires phase. The Getting Started Guide should be as simple as possible, and yet it should stress the key elements of your artifact. Anyone who has successfully followed the Getting Started Guide should have no technical difficulties with your artifact.
Your artifact’s documentation should include the following:
Examples of unsupported claims: Performance claims cannot be reproduced in VM, authors are not allowed to redistribute specific benchmarks, etc.
Artifact reviewers can use this documentation to base their reviews/evaluations on these specific claims. The reviewers will still also consider whether the provided evidence is adequate to support claims that the artifact works as expected.
The “Step-by-Step Instructions” explain how to reproduce any experiments or other activities that support the individual claims in your article. Write this for readers who have a deep interest in your work and are studying it to improve it or compare against it. If your artifact runs for more than a few minutes to exhibit the desired effect, point this out, note how long it is expected to run (roughly), and explain how to run it on smaller inputs. Reviewers may choose to run on smaller or larger inputs depending on available hardware.
Where appropriate, include descriptions of and links to files (included in the archive) that represent expected outputs (e.g., the log files expected to be generated by your tool on the given inputs); if there are warnings that are safe to be ignored, explain which ones those are.
When packaging your artifact, please keep in mind: a) how accessible you are making your artifact to other researchers and b) the fact that the AEC members will have a limited time in which to make an assessment of each artifact.
We ask that you make the artifact available as a Docker image or a virtual machine (VM) image in OVF / OVA format containing the installed artifact and all of the necessary libraries installed. A VM provides an easily reproducible environment. It also helps the AEC have confidence that errors or other problems cannot cause harm to their machines.
You should make your artifact available as a single archive file. Please use a widely available compressed archive format such as ZIP (
.zip), tar and gzip (
.tgz), or tar and bzip2 (
.tbz2). Please use open formats for documents and preferably CSV or JSON for data.
For ensuring quality packaging, we strongly recommend testing your own instructions on a fresh machine (or VM), following exactly the instructions you have given.
Authors submitting machine-checked proof artifacts should consult Marianna Rapoport’s Proof Artifacts: Guidelines for Submission and Reviewing.
While publicly available artifacts are often easier to review and considered to be in the best interest of open science, artifacts are not required to be public or open-source. Artifact reviewers will be instructed that the artifacts are for use only for artifact evaluation, that submitted versions of artifacts must not be made public by reviewers, and that copies of artifacts must not be kept beyond the review period. There is an additional badge specifically for making artifacts available in reliable locations (see below) and we strongly encourage authors of accepted artifacts to pursue it, but it is a separate process from evaluation of functionality and not required.
After submitting their artifact, there is a short window of time in which the reviewers will work through only the kick-the-tires instructions and upload preliminary reviews indicating whether or not they were able to get those 30-or-so minutes of instructions working. At that point, the preliminary reviews will be shared with authors, who may make modest updates and corrections in order to resolve any issues the reviewers encountered.
We allow additional rounds of interaction with reviewers in the case new issues are discovered after the kick-the-tires window. This is in the hope that artifacts that would be just short of being Functional can have more opportunities to make small corrections. After the kick-the-tires response, reviewers will be able to post author-visible comments with questions for authors at any time, and authors may respond to those reviewer questions or requests. Such interaction is on the reviewers’ initiative; authors will be asked not to post unless in response to reviewer requests.
The AEC evaluates each artifact for the awarding of Functional or Reusable badges:
This is the basic “accepted” outcome for an artifact. An artifact can be awarded a Functional badge if the artifact supports all claims made in the article, possibly excluding some claims if there are very good reasons they cannot be supported. In the ideal case, an artifact with this designation includes all relevant code, dependencies, and input data (e.g., benchmarks) and the artifact’s documentation is sufficient for reviewers to reproduce the exact results described in the article. If the artifact is claimed to outperform a related system in some way (in time, accuracy, etc.) and that related system was used to generate new numbers for the article (e.g., an existing tool was run on new benchmarks not considered by the corresponding publication), artifacts should include a version of that related system and instructions for reproducing the numbers used for comparison as well. If the alternative tool crashes on a subset of the inputs, simply note this expected behavior.
Deviations from this ideal must be for good reason. A non-exclusive list of justifiable deviations includes:
In some cases, the artifact may require specialized hardware (e.g., a CPU with a particular new feature, a specific class of GPU, or a cluster of GPUs). For such cases, authors should contact the Artifact Evaluation Co-Chairs (Patrick Rein and Fabio Niephaus) as soon as possible after their articles’ acceptance notification to work out how to make these possible to evaluate. One possible outcome is that the authors of the artifact requiring specialized hardware pay for a cloud instance with the hardware, which reviewers could access remotely.
A Reusable badge is given when the artifact not only satisfies the requirements to be Functional but additionally reviewers feel the artifact is particularly well packaged, documented, designed, etc. to support future research that might build on the artifact. For example, if it seems relatively easy for others to reuse this directly as the basis of a follow-on project, the AEC may award a Reusable badge.
For binary-only artifacts to be considered Reusable, it must be possible for others to directly use the binary in their own research, such as a JAR file with very high-quality client documentation for someone else to use it as a component in their own project.
Artifacts with source can be considered reusable:
Artifacts given the Functional or Reusable badge are generally referred to as accepted.
This badge is automatically earned by artifacts that are made publicly available in an archival location. This can be earned by an artifact including those not reviewed by the AEC and those reviewed but not found Functional during reviewing.
There are two routes for this:
Common issues in the kick-the-tires phase are:
Conflicts of interests for AEC members are handled by the chairs. Conflicts of interest involving one of the two AEC chairs are handled by the other AEC chair, or the associate editor of the current volume if both chairs are conflicted.
The description of this process is based on documents from similar ECOOP and SPLASH AE processes. Thanks for creating and sharing these documents go to Benjamin Greenman, Ana Milanova, Colin Gordon, and Anders Møller.