IN GENERAL

One does not have to iterate over folders of patient files anymore!
- All patients are harmonized in the large .parquet files within reprodICU_files; a smaller DEMO dataset (built from the MIMIC-III, MIMIC-IV and eICU demos) is available within reproDEMO
Any kind of data manipulation / selection can now happen lazily, without processing the full database every time
- Use polars instead of pandas
Because processing steps are now declared much more clearly (mostly via .pipe()), the resulting code is much more readable
- One can also easily precalculate pipe-able functions for the full dataset, so that the saved results can be loaded lazily as well (increasing efficiency even further)
Any imputation or preprocessing that happens after the first basic harmonization step is now explicitly declared and can be reproduced independently of a full database rebuild
Some variables and their locations have moved:
- The table flats no longer exists
  - Height, weight, age etc. are given as raw values (or as imputed / winsorized values when processing steps are applied)
- The table labels was renamed to patient_information and now contains additional information (where available), such as
  - ethnicity
  - pre-ICU stay duration (in days)
  - in-ICU mortality
  - in-hospital mortality
  - post-ICU mortality (truthy, in days)
  - admission type
  - admission urgency
  - admission category
  - specialty
- The table timeseries was split by time series category and approximate logging frequency to reduce the required storage space
  - vitals contains values such as heart rate, blood pressure and O2 saturation by pulse oximetry
  - respiratory contains values concerning ventilation
  - labs contains lab values (incl. blood gases)
  - inout contains intake / output volumes

WHAT IS WHERE?

structure of laboratory data