HubValidations is an R package designed to validate submissions to a forecast hub.

Most of the time, validations will be set up to run automatically via GitHub Actions when a pull request is opened in a hub GitHub repository.

However, you might want to validate your submission locally, either to check that everything is okay in advance, or to investigate the reason of a failure in the automated checks.

Interpreting the output of HubValidations

By default, validation functions from HubValidations (functions starting with validate_...()) will print all the checks performed and passing checks will be shown on lines starting with while failing check lines will start with x.

Checking all your changes

To reproduce the checks done automatically when you open a pull request, the simplest option is to place yourself at the root of the hub GitHub repository, and call validate_repository() or validate_pr().

validate_repository(
  data_folder = system.file("testdata", "data-processed", package = "HubValidations"),
  metadata_folder = system.file("testdata", "model-metadata", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#>  example-model: There is exactly one metadata file 
#> 
#>  2021-07-19-example-model.csv: Folder name is identical to model name in data file 
#> 
#>  2021-07-26-example-model.csv: Folder name is identical to model name in data file 
#> 
#>  example-model.yml: Metadata file is using the `.yml` extension 
#> 
#>  example-model.yml: Metadata filename is the same as `model_abbr` 
#> 
#>  example-model.yml: Metadata file is consistent with schema specifications 
#>  
#>  2021-07-19-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-19-example-model.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-19-example-model.csv: Data is formed of the expected columns with correct type 
#>  
#>  2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-26-example-model.csv: Data is formed of the expected columns with correct type 
#>  
#>  example-model2: There is exactly one metadata file 
#> 
#>  2021-07-19-example-model2.csv: Folder name is identical to model name in data file 
#> 
#>  2021-07-26-example-model2.csv: Folder name is identical to model name in data file 
#> 
#>  example-model2.yml: Metadata file is using the `.yml` extension 
#> 
#>  example-model2.yml: Metadata filename is the same as `model_abbr` 
#> 
#>  example-model2.yml: Metadata file is consistent with schema specifications 
#>  
#>  2021-07-19-example-model2.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-19-example-model2.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-19-example-model2.csv: Data is formed of the expected columns with correct type 
#>  
#>  2021-07-26-example-model2.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-26-example-model2.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-26-example-model2.csv: Data is formed of the expected columns with correct type 
#>  
#>  example-model.yml: There is only one primary model for a given team

Checking only your model

You can check a specific model instead of checking the whole repository by running validate_model(). In this case, you will also need to specify the path to the schema files used for the validation. Usually, these files will be at the root of the folder containing each team folder. In the case of the European Covid-19 Forecast Hub, this folder is named data-processed.

validate_model(
  "example-model",
  data_folder = system.file("testdata", "data-processed", package = "HubValidations"),
  metadata_folder = system.file("testdata", "model-metadata", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#>  example-model: There is exactly one metadata file 
#> 
#>  2021-07-19-example-model.csv: Folder name is identical to model name in data file 
#> 
#>  2021-07-26-example-model.csv: Folder name is identical to model name in data file 
#> 
#>  example-model.yml: Metadata file is using the `.yml` extension 
#> 
#>  example-model.yml: Metadata filename is the same as `model_abbr` 
#> 
#>  example-model.yml: Metadata file is consistent with schema specifications 
#>  
#>  2021-07-19-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-19-example-model.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-19-example-model.csv: Data is formed of the expected columns with correct type 
#>  
#>  2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-26-example-model.csv: Data is formed of the expected columns with correct type 
#> 

Checking only your metadata file

Alternatively, you can check only the metadata file with validate_model_metadata(). You will have to specify the path to the metadata schema file

validate_model_metadata(
  system.file("testdata", "model-metadata", "example-model.yml", package = "HubValidations"),
  metadata_schema = system.file("testdata", "schema-metadata.yml", package = "HubValidations")
)
#>  example-model.yml: Metadata file is using the `.yml` extension 
#> 
#>  example-model.yml: Metadata filename is the same as `model_abbr` 
#> 
#>  example-model.yml: Metadata file is consistent with schema specifications 
#> 

Checking only your projection file

Alternatively, you can check only the projection file with validate_model_data(). You will have to specify the path to the metadata schema file

validate_model_data(
  system.file("testdata", "data-processed", "example-model", "2021-07-26-example-model.csv", package = "HubValidations"),
  data_schema = system.file("testdata", "schema-data.yml", package = "HubValidations")
)
#>  2021-07-26-example-model.csv: Filename is formed of a date and a model name, separated by an hyphen 
#> 
#>  2021-07-26-example-model.csv: `forecast_date` column is identical to the date in filename 
#> 
#>  2021-07-26-example-model.csv: Data is formed of the expected columns with correct type 
#>