GENERAL

There are two important parts to the workflow:

Cohort Identification
- Goal: Define and identify the relevant cohort to analyse.
  - Output: A list of Global ICU Stay IDs corresponding to the selected cohort.
- Apply inclusion and exclusion criteria as per the analysis plan.
- Document the cohort selection process for transparency and reproducibility.

Analysis
- Goal: Execute relevant analyses.
- Implement statistical models, visualizations, and other analytical methods.
- Include comments and documentation for clarity and reproducibility.

Although it might seem self-explanatory, it helps to think of the two parts separately, because reprodICU is explicitly designed not to be restricted to Python.

The tutorials (and most support) for reprodICU are given in Python. However, if desired, one can identify the relevant patients in reprodICU, aggregate the relevant variables, export them to CSV (or another format), and work with the data in R, Stata, SPSS, or whatever software is preferred.
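As a minimal sketch of the export step, assuming the aggregated data lives in a pandas DataFrame: the table contents, the column names, and the helper `export_for_external_tools` are hypothetical and not part of reprodICU. The example writes to an in-memory buffer; passing a file path instead would write a CSV file that R, Stata, or SPSS can read.

```python
import io

import pandas as pd

# Hypothetical aggregated analysis table; column names are illustrative only.
analysis_df = pd.DataFrame(
    {
        "global_icu_stay_id": ["db1-1", "db1-2", "db2-7"],
        "age_years": [64, 71, 58],
        "exposed": [True, False, True],
    }
)


def export_for_external_tools(df: pd.DataFrame, target) -> None:
    """Write the table as CSV without the pandas row index.

    `target` may be a file path or any writable text buffer.
    """
    df.to_csv(target, index=False)


# Write to an in-memory buffer so the example leaves no file behind.
buffer = io.StringIO()
export_for_external_tools(analysis_df, buffer)
csv_text = buffer.getvalue()
```

Dropping the index keeps the file to one row per stay with the stay ID as an ordinary column, which is usually the easiest shape to import elsewhere.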

`TODO`: LINK TO JUPYTER NOTEBOOK

`TODO`: LINK TO MARIMO NOTEBOOK

1. Include / Exclude Patients

Create boolean masks on the Global ICU Stay ID in the table to define the set of included patients.
In practice, boolean masks may be created in a step-down procedure (i.e. for computational efficiency, each criterion is evaluated only on the patients that passed the previous ones). However, the underlying code should not depend on the order of the inclusion/exclusion operations.
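The following sketch illustrates order-independent masks, assuming a pandas DataFrame indexed by the Global ICU Stay ID. The column names (`age_years`, `icu_los_days`) and the two criteria are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical stay-level table; column names are illustrative only.
stays = pd.DataFrame(
    {
        "global_icu_stay_id": ["a-1", "a-2", "b-1", "b-2"],
        "age_years": [17, 45, 80, 62],
        "icu_los_days": [3.0, 0.5, 4.0, 2.0],
    }
).set_index("global_icu_stay_id")

# Each criterion is a boolean Series on the same index and does not
# depend on any other criterion having been applied first.
is_adult = stays["age_years"] >= 18
stayed_at_least_one_day = stays["icu_los_days"] >= 1.0

# Combining with a logical AND gives the same result in either order.
included = is_adult & stayed_at_least_one_day
included_reversed = stayed_at_least_one_day & is_adult

# The cohort is the list of stay IDs where the combined mask is True.
cohort_ids = included[included].index.tolist()

# Step-down variant: evaluate the second criterion only on the stays
# that passed the first one. The resulting cohort is identical.
survivors = stays[is_adult]
step_down_ids = survivors[survivors["icu_los_days"] >= 1.0].index.tolist()

# Number of stays passing each criterion, useful for documenting selection.
attrition = {
    "all_stays": len(stays),
    "adult": int(is_adult.sum()),
    "adult_and_los": int(included.sum()),
}
```

Because every criterion is a function of the row alone, the step-down variant is purely an optimisation and can be swapped in without changing the cohort.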

2. Determine Exposure

Define one or more concepts for the exposure relevant to the study. The corresponding code should be written so that the exposure can be pre-computed on the complete dataset.
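One way to keep an exposure definition pre-computable is to write it as a function that takes the complete data and returns one value per stay, independent of any cohort selection. The sketch below assumes pandas; the exposure (a vasopressor started within the first 24 hours), the table layout, and all names are hypothetical.

```python
import pandas as pd


def early_vasopressor_exposure(meds: pd.DataFrame, all_stay_ids: pd.Index) -> pd.Series:
    """Return one boolean per stay in `all_stay_ids`.

    True if the stay has at least one vasopressor record within the
    first 24 hours after ICU admission (hypothetical definition).
    """
    early = meds[meds["is_vasopressor"] & (meds["hours_since_admission"] <= 24)]
    exposed_ids = early["global_icu_stay_id"].unique()
    return pd.Series(
        all_stay_ids.isin(exposed_ids),
        index=all_stay_ids,
        name="early_vasopressor",
    )


# Hypothetical medication records for the complete dataset.
meds = pd.DataFrame(
    {
        "global_icu_stay_id": ["a-1", "a-1", "b-1", "b-2"],
        "is_vasopressor": [True, False, True, False],
        "hours_since_admission": [2.0, 30.0, 48.0, 5.0],
    }
)
all_stay_ids = pd.Index(["a-1", "a-2", "b-1", "b-2"], name="global_icu_stay_id")

# Computed once for every stay; a cohort is applied later by indexing.
exposure = early_vasopressor_exposure(meds, all_stay_ids)
cohort_exposure = exposure.loc[["b-1", "b-2"]]
```

Stays without any matching record are returned as False rather than dropped, so the result always has one row per stay.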

3. Determine Covariates

Define or reuse established concepts for the covariates relevant to the study.
Common covariates such as the Elixhauser Comorbidity Index are (or should be) made available as precalculated dataframes for the complete dataset.
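A precalculated covariate table can be restricted to a cohort by looking up the cohort's stay IDs. The sketch below assumes pandas; the table, its column name `elixhauser_score`, and the values are made up for illustration and do not reflect how reprodICU stores the index.

```python
import pandas as pd

# Hypothetical precalculated covariate table covering the complete dataset.
elixhauser = pd.DataFrame(
    {
        "global_icu_stay_id": ["a-1", "a-2", "b-1", "b-2"],
        "elixhauser_score": [0, 5, 12, 3],
    }
).set_index("global_icu_stay_id")

# Cohort as produced by the inclusion / exclusion step (illustrative IDs).
cohort_ids = ["b-1", "b-2"]

# reindex keeps the cohort order and marks stays that are absent from
# the covariate table with NaN instead of silently dropping them.
cohort_covariates = elixhauser.reindex(cohort_ids)

# Fail loudly if a cohort stay has no precalculated value.
missing = cohort_covariates["elixhauser_score"].isna()
if missing.any():
    raise ValueError(f"No covariate value for: {list(cohort_covariates.index[missing])}")
```

Using `reindex` rather than a plain filter makes gaps in the precalculated table visible before they reach the analysis.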

4. Aggregate the Data

Include / exclude patients based on the masks from step 1
Calculate the exposure defined in step 2