HubValidations is an R package designed to validate submissions to a forecast hub.
Most of the time, validations will be set up to run automatically via GitHub Actions when a pull request is opened in a hub GitHub repository.
However, you might want to validate your submission locally, either to check in advance that everything is in order or to investigate why an automated check failed.
By default, the validation functions from HubValidations (functions whose names start with validate_...()) print every check they perform. Lines for passing checks start with ✔, while lines for failing checks start with x.
To reproduce the checks that run automatically when you open a pull request, the simplest option is to set your working directory to the root of the hub GitHub repository and call validate_repository() or validate_pr().
validate_repository(
  data_folder = system.file("testdata", "data-processed", package = "HubValidations"),
  metadata_folder = system.file("testdata", "model-metadata", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#> ✔ example-model: There is exactly one metadata file
#>
#> ✔ 2021-07-19-example-model.csv: Folder name is identical to model name in data file
#>
#> ✔ 2021-07-26-example-model.csv: Folder name is identical to model name in data file
#>
#> ✔ example-model.yml: Metadata file is using the `.yml` extension
#>
#> ✔ example-model.yml: Metadata filename is the same as `model_abbr`
#>
#> ✔ example-model.yml: Metadata file is consistent with schema specifications
#>
#> ✔ 2021-07-19-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-19-example-model.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-19-example-model.csv: Data is formed of the expected columns with correct type
#>
#> ✔ 2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-26-example-model.csv: Data is formed of the expected columns with correct type
#>
#> ✔ example-model2: There is exactly one metadata file
#>
#> ✔ 2021-07-19-example-model2.csv: Folder name is identical to model name in data file
#>
#> ✔ 2021-07-26-example-model2.csv: Folder name is identical to model name in data file
#>
#> ✔ example-model2.yml: Metadata file is using the `.yml` extension
#>
#> ✔ example-model2.yml: Metadata filename is the same as `model_abbr`
#>
#> ✔ example-model2.yml: Metadata file is consistent with schema specifications
#>
#> ✔ 2021-07-19-example-model2.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-19-example-model2.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-19-example-model2.csv: Data is formed of the expected columns with correct type
#>
#> ✔ 2021-07-26-example-model2.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-26-example-model2.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-26-example-model2.csv: Data is formed of the expected columns with correct type
#>
#> ✔ example-model.yml: There is only one primary model for a given team
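The call above points at the test data bundled with the package. For a local clone of a hub, the same four arguments could be built from the repository root, as in the minimal sketch below. The helper hub_paths() is hypothetical, and the folder and schema file names simply mirror the layout of the bundled test data; they are assumptions that may not match your hub.

```r
# Hypothetical helper: build the four path arguments for a local hub clone.
# The folder and schema file names are assumptions copied from the layout of
# the package's test data; adjust them to your hub.
hub_paths <- function(hub_root) {
  list(
    data_folder     = file.path(hub_root, "data-processed"),
    metadata_folder = file.path(hub_root, "model-metadata"),
    data_schema     = file.path(hub_root, "schema-data.yml"),
    metadata_schema = file.path(hub_root, "schema-metadata.yml")
  )
}

# "path/to/hub" is a placeholder for the root of your local clone.
paths <- hub_paths("path/to/hub")

# Only run the validation when the package and the data folder are available.
if (requireNamespace("HubValidations", quietly = TRUE) &&
    dir.exists(paths$data_folder)) {
  do.call(HubValidations::validate_repository, paths)
}
```

Keeping the paths in one list makes it easy to reuse them with the other validation functions that take the same arguments.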
You can check a specific model instead of the whole repository by running validate_model(). In this case, you will also need to specify the paths to the schema files used for the validation. Usually, these files will be at the root of the folder containing each team folder. In the case of the European Covid-19 Forecast Hub, this folder is named data-processed.
validate_model(
  "example-model",
  data_folder = system.file("testdata", "data-processed", package = "HubValidations"),
  metadata_folder = system.file("testdata", "model-metadata", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#> ✔ example-model: There is exactly one metadata file
#>
#> ✔ 2021-07-19-example-model.csv: Folder name is identical to model name in data file
#>
#> ✔ 2021-07-26-example-model.csv: Folder name is identical to model name in data file
#>
#> ✔ example-model.yml: Metadata file is using the `.yml` extension
#>
#> ✔ example-model.yml: Metadata filename is the same as `model_abbr`
#>
#> ✔ example-model.yml: Metadata file is consistent with schema specifications
#>
#> ✔ 2021-07-19-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-19-example-model.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-19-example-model.csv: Data is formed of the expected columns with correct type
#>
#> ✔ 2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-26-example-model.csv: Data is formed of the expected columns with correct type
#>
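To make the naming checks reported above more concrete, here is a rough base R sketch of what "Filename is formed of a date and a model name" and "Folder name is identical to model name in data file" could look like. This is not the package's implementation: the function check_forecast_filename() and the regular expression are assumptions made for illustration only.

```r
# Illustrative check, not HubValidations code: a forecast file is expected to
# be named "<YYYY-MM-DD>-<model name>.csv" and to sit in a folder named after
# the model. The pattern below is an assumption.
check_forecast_filename <- function(path) {
  file <- basename(path)
  folder <- basename(dirname(path))
  pattern <- "^([0-9]{4}-[0-9]{2}-[0-9]{2})-(.+)\\.csv$"

  if (!grepl(pattern, file)) {
    return(list(valid_name = FALSE, folder_matches = FALSE))
  }

  # Second capture group: everything between the date and the extension.
  model <- sub(pattern, "\\2", file)
  list(valid_name = TRUE, folder_matches = identical(model, folder))
}

check_forecast_filename(
  "data-processed/example-model/2021-07-26-example-model.csv"
)
```

If a local file fails one of these naming checks in the real validation, renaming the file or moving it to the matching model folder is usually the first thing to try.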
Alternatively, you can check only the metadata file with validate_model_metadata(). You will have to specify the path to the metadata schema file:
validate_model_metadata(
  system.file("testdata", "model-metadata", "example-model.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#> ✔ example-model.yml: Metadata file is using the `.yml` extension
#>
#> ✔ example-model.yml: Metadata filename is the same as `model_abbr`
#>
#> ✔ example-model.yml: Metadata file is consistent with schema specifications
#>
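The first two metadata checks concern only the file name, so they are easy to approximate before running the full validation. The sketch below is an assumption about what they verify, written in base R. The function check_metadata_filename() is hypothetical, and model_abbr is passed in by hand rather than read from the YAML file.

```r
# Illustrative check, not HubValidations code: the metadata file should use
# the ".yml" extension and be named after the model's `model_abbr` value.
check_metadata_filename <- function(path, model_abbr) {
  c(
    yml_extension = identical(tools::file_ext(path), "yml"),
    matches_model_abbr = identical(
      tools::file_path_sans_ext(basename(path)),
      model_abbr
    )
  )
}

check_metadata_filename("model-metadata/example-model.yml", "example-model")
```

The third check, consistency with the schema, depends on the contents of the schema file and is best left to validate_model_metadata() itself.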
Similarly, you can check only the projection file with validate_model_data(). You will have to specify the path to the data schema file:
validate_model_data(
  system.file("testdata", "data-processed", "example-model", "2021-07-26-example-model.csv", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations")
)
#> ✔ 2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen
#>
#> ✔ 2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename
#>
#> ✔ 2021-07-26-example-model.csv: Data is formed of the expected columns with correct type
#>
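As an illustration of the second check in this output, the sketch below compares the date at the start of a file name with a forecast_date column. It is a simplified assumption about what the check does, not the package's code. The function check_forecast_date() is hypothetical, and the example data frame contains only the one column needed here.

```r
# Illustrative check, not HubValidations code: every value in the
# `forecast_date` column should equal the date at the start of the file name.
check_forecast_date <- function(file_name, forecasts) {
  # The first 10 characters of the file name are expected to be YYYY-MM-DD.
  file_date <- as.Date(substr(basename(file_name), 1, 10), format = "%Y-%m-%d")
  column_dates <- as.Date(forecasts$forecast_date, format = "%Y-%m-%d")
  # isTRUE() turns an unparsable date (NA) into a failed check.
  isTRUE(all(column_dates == file_date))
}

# Made-up data for illustration; real forecast files have more columns.
forecasts <- data.frame(forecast_date = c("2021-07-26", "2021-07-26"))

check_forecast_date("2021-07-26-example-model.csv", forecasts)
```

A mismatch here typically means the file was renamed without regenerating the forecasts, or the other way around.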