Active measurement is widely used, and many public data sets are generated continually for research use. Given several documented incidents of problems on RIPE Atlas and other platforms, how do we know that the data are any good? Because host system and last-hop network effects can degrade measurement fidelity, how can one identify “bad” time segments (i.e., bad measurements)? Our answer: collect metadata. Despite calls in prior IMC papers and related literature to collect and store metadata, it is done only in ad hoc ways or not at all. In short, there are no de facto standard tools or methods for collecting metadata. The goals of our work are to provide a critical, expanded perspective on measurement results and to improve the opportunity for reproducibility of results.
Inspired by Grosvenor et al., we have made all the scripts used for this project available here. We hope that other researchers in the community will do the same in the future.