Folder structure

This workflow assumes the following folder structure. Logger files must be named record_NN.h5 or record_NNN.h5.

Note that, because the parameters are communicated in a Pickle file, the filenames MUST be indexed from 0 (the first run is record_00.h5).

SIMULATION_NAME/
    parameter_grid.json # Describes each run
    record_00.h5
    record_01.h5
    ...
    record_NN.h5 # Each logger run
    sites.geojson # Polygons of the regions
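The naming rule above can be checked before any long-running work starts. The following is a minimal sketch, not part of the project's code: the helper names `record_indices` and `check_zero_indexed` are hypothetical, and the requirement that indices be contiguous (no gaps) is an assumption that goes beyond the stated "indexed at 0" rule.

```python
import re

# Matches record_NN.h5 or record_NNN.h5 (two or three digits).
RECORD_PATTERN = re.compile(r"^record_(\d{2,3})\.h5$")


def record_indices(filenames):
    """Return the sorted run indices found in an iterable of file names."""
    indices = []
    for name in filenames:
        match = RECORD_PATTERN.match(name)
        if match:
            indices.append(int(match.group(1)))
    return sorted(indices)


def check_zero_indexed(filenames):
    """Raise ValueError unless the records are numbered 0, 1, ..., n-1.

    Assumption: runs are contiguous, so a missing index is treated as an error.
    """
    indices = record_indices(filenames)
    if indices != list(range(len(indices))):
        raise ValueError(f"records must be numbered from 0 without gaps, got {indices}")
    return indices
```

Calling `check_zero_indexed(os.listdir("SIMULATION_NAME"))` would then fail early if, say, the first file were record_01.h5.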

From these inputs, you specify an output_directory. Summaries written there are automatically available for frontend analysis.

public/demo/projects/SIMULATION_NAME/
    metadata.json
    summary_000.csv
    summary_001.csv
    ...
    summary_NNN.csv

We are provided a parameter_grid.json file that looks like the following:

{
    "pub": [
        0.0953169,
        0.521456,
        0.40569099999999997,
        0.484659,
        0.138482
    ],
    "grocery": [
        0.387384,
        0.452953,
        0.548852,
        0.042028699999999995,
        0.21261799999999997
    ], ...
}

(In this case, there are 5 runs, and run i takes the i-th value of each parameter list. This makes a grid search in the interface tricky, since most values are distinct.)
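To make the "one value per run" reading concrete, here is a small sketch that transposes a parameter grid into one dictionary per run. The function name `grid_to_runs` is hypothetical and is not claimed to match the project's `pgrid_to_run_parameters`. Keying runs from 0 follows the file naming rule and is an assumption, and the sample values are the first two entries of each list shown above.

```python
def grid_to_runs(parameter_grid):
    """Turn {"param": [v0, v1, ...]} into {0: {"param": v0}, 1: {"param": v1}, ...}."""
    lengths = {len(values) for values in parameter_grid.values()}
    if len(lengths) > 1:
        raise ValueError("every parameter needs exactly one value per run")
    n_runs = lengths.pop() if lengths else 0
    return {
        run: {name: values[run] for name, values in parameter_grid.items()}
        for run in range(n_runs)
    }


grid = {"pub": [0.0953169, 0.521456], "grocery": [0.387384, 0.452953]}
runs = grid_to_runs(grid)
print(runs[1])  # {'pub': 0.521456, 'grocery': 0.452953}
```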

Check available projects

Because extracting from the records can take a while, we don't want to overwrite an existing project unless explicitly requested.
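One way to express this guard is sketched below. It assumes that the presence of metadata.json in the output directory marks a project as already built. Both that convention and the name `should_build` are illustrative, not taken from `init_available_projects`.

```python
from pathlib import Path


def should_build(project_dir, overwrite=False):
    """Return True if summaries should be (re)generated for project_dir.

    Assumption: a project counts as existing once metadata.json is present.
    """
    already_built = (Path(project_dir) / "metadata.json").exists()
    return overwrite or not already_built
```

With this in place, the expensive extraction step only runs for new projects, or when the caller passes `overwrite=True`.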

Create the Summary CSVs

Take the record_**.h5 files and convert them to CSVs the frontend can parse.

These record files can be on the order of 8GB, and summarizing each one can take about 45 minutes. The implementation works, but it is neither efficient nor parallelized.
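Given the cost per record, it can help to skip records whose summary already exists. The sketch below only handles the file bookkeeping and does not read the HDF5 contents. It assumes the run index carries over from record_NN.h5 to summary_NNN.csv (zero-padded to three digits, as in the folder listing), and the helper names are hypothetical.

```python
import re
from pathlib import Path


def summary_path_for(record_path, output_dir):
    """Map .../record_07.h5 to output_dir/summary_007.csv."""
    match = re.fullmatch(r"record_(\d{2,3})\.h5", Path(record_path).name)
    if match is None:
        raise ValueError(f"not a record file: {record_path}")
    run = int(match.group(1))
    return Path(output_dir) / f"summary_{run:03d}.csv"


def pending_records(record_dir, output_dir):
    """List (record, summary) path pairs whose summary CSV does not exist yet."""
    pairs = []
    for record in sorted(Path(record_dir).glob("record_*.h5")):
        target = summary_path_for(record, output_dir)
        if not target.exists():
            pairs.append((record, target))
    return pairs
```

Each pending pair is independent of the others, so the pairs could also be handed to separate worker processes if the machine has enough memory.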

Creating the `metadata.json`

We want to convert the provided parameter_grid.json file into a metadata.json file that also includes some basic summary statistics from the project. The metadata.json file has the following format:

{
    "description": "Learning center comparison",
    "parameters_varied": [
        "indoor_beta",
        "outdoor_beta",
        "household_beta",
        "learning_centers"
    ],
    "run_parameters": {
        "1": {
            "learning_centers": false,
            "household_beta": 0.2,
            "indoor_beta": 0.45,
            "outdoor_beta": 0.05
        },
        "2": {
            "learning_centers": false,
            "household_beta": 0.2,
            "indoor_beta": 0.55,
            "outdoor_beta": 0.05
        }, ...
    },
    "all_regions": [
        "CXB-201",
        "CXB-202", ...
    ], 
    "all_timestamps": [
        "2020-05-01",
        "2020-05-02", ...
    ], 
    "all_fields": [
        "currently_dead",
        "currently_in_hospital_0_12", ...
    ],
    "field_statistics": {
        "n_infections_in_communal": {
            "max": 132.0,
            "min": 0.0
        },
        "recovered": {
            "max": 1937.0,
            "min": 0.0
        }, ...
    }
}

This involves restructuring the provided parameter grid and parsing the new summary_**.csv files for the extent (min and max) of each field.
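The extent calculation can be sketched with the standard library alone. The example below is not the project's `collect_statistics`. It assumes each summary CSV has a header row, that numeric columns are fields, and that two hypothetical columns named `region` and `timestamp` should be skipped.

```python
import csv


def field_extents(csv_paths, skip=("region", "timestamp")):
    """Return {field: {"min": ..., "max": ...}} over all rows of all CSVs."""
    stats = {}
    for path in csv_paths:
        with open(path, newline="") as handle:
            for row in csv.DictReader(handle):
                for field, raw in row.items():
                    if field in skip:
                        continue
                    try:
                        value = float(raw)
                    except (TypeError, ValueError):
                        continue  # ignore blank and non-numeric cells
                    entry = stats.setdefault(field, {"min": value, "max": value})
                    entry["min"] = min(entry["min"], value)
                    entry["max"] = max(entry["max"], value)
    return stats
```

The returned dictionary has the same shape as the `field_statistics` block in the metadata example, so it could be written out with `json.dump` alongside the run parameters.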

Copying the `sites.geojson`

This part is a bit simpler. We need to copy the sites.geojson file from the provided records to the output directory.

Note: some geojson files may be very large. This is the place to reduce the size to something more reasonable yet still functional.

Also, some geojson files for this project are annotated with SSID as the 'property' that describes each region, while others are annotated with the region key. We need to unify this interface.
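A possible way to unify the two annotations is to copy whichever property is present into a single agreed key. In the sketch below, `SSID` comes from the description above, while the property name `region` and the function name `unify_region_key` are assumptions chosen for illustration.

```python
def unify_region_key(geojson, candidates=("SSID", "region"), target="region"):
    """Copy the first candidate property found on each feature into `target`.

    The input dictionary is modified in place and also returned.
    """
    for feature in geojson.get("features", []):
        props = feature.get("properties") or {}  # GeoJSON allows null properties
        feature["properties"] = props
        for key in candidates:
            if key in props:
                props[target] = props[key]
                break
        else:
            raise KeyError(f"feature has none of the properties {candidates}")
    return geojson
```

The original property is left in place, so anything that still reads `SSID` keeps working.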

Fixing the sites.geojson

We need to unify the geojson file a bit. First, the files are very large and high resolution, which makes them very slow to load in the frontend. Second, the multipolygons are rendering incorrectly.
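As one illustration of reducing file size, the sketch below rounds every coordinate to a fixed number of decimal places. This is a swapped-in, simple technique and is not claimed to be what `fix_geojson` does. It assumes every feature has a geometry with a `coordinates` array (for example Polygon or MultiPolygon), and it does not address the multipolygon rendering problem.

```python
def round_coordinates(coords, ndigits=5):
    """Recursively round a GeoJSON coordinate array of any nesting depth."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coordinates(item, ndigits) for item in coords]


def shrink_geojson(geojson, ndigits=5):
    """Return a copy of a FeatureCollection with rounded geometry coordinates."""
    features = []
    for feature in geojson.get("features", []):
        geometry = dict(feature["geometry"])  # shallow copy; input stays untouched
        geometry["coordinates"] = round_coordinates(geometry["coordinates"], ndigits)
        features.append({**feature, "geometry": geometry})
    return {**geojson, "features": features}
```

Rounding shortens each number in the serialized file but keeps every vertex. Removing vertices with a line simplification algorithm such as Douglas-Peucker would reduce the size further, at the cost of an extra dependency or more code.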

Initializing a new project

Folder structure

Check available projects

`init_available_projects`[source]

Create the Summary CSVs

`summarize_h5`[source]

Creating the `metadata.json`

`pgrid_to_run_parameters`[source]

`collect_statistics`[source]

Copying the `sites.geojson`

Fixing the sites.geojson

`fix_geojson`[source]

Bundle as Script

`main`[source]
