PatientProfiles tutorial

PatientProfiles

Scope

PatientProfiles (developer focus)

To add characteristics to tables.
To identify intersections with cohorts, concepts and omop tables.
Summarise data in summarised_result format

CohortCharacteristics (user focus)

Summarise cohorts:
- Charcateristics
- Large scale characteristics
- Cohort overlap
- Cohort timing
Create visualisations (tables and figures).

addSex()

To create a new column with the sex of the individual. The original table must contain person_id or subject_id:

cdm$my_cohort |>
  addSex(
    sexName = "sex", # name of the new column (default = "sex")
    missingSexValue = "None" # label for missing gender_concept_id (default = "None")
  )

# Source:   table<og_002_1713352417> [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date sex   
                  <int>      <int> <date>            <date>          <chr> 
 1                    1        318 1960-06-27        2018-10-02      Female
 2                    1        762 1979-01-06        2019-02-13      Male  
 3                    1       1068 1978-07-22        2018-11-02      Male  
 4                    1       1508 1941-12-12        2019-05-14      Female
 5                    1       1656 1992-07-14        2018-01-27      Female
 6                    1       2191 1971-12-13        2018-05-13      Male  
 7                    1       2597 1976-11-23        2018-12-18      Male  
 8                    1       3208 2003-09-08        2019-01-18      Female
 9                    1       3250 1977-06-07        2019-02-12      Male  
10                    1       3494 1962-02-22        2018-11-29      Female
# ℹ more rows

addAge()

To create a new column with the age of the individual. The original table must contain person_id or subject_id:

cdm$my_cohort |>
  addAge(
    indexDate = "cohort_start_date", # date to compute age (default = "cohort_start_date") 
    ageName = "age", # name of the age column (default = "age") 
    ageDefaultMonth = 1, # Month for individuals with missing month (default = 1) 
    ageDefaultDay = 1, # Day for individuals with missing day (default = 1)
    ageImposeMonth = F, # Whether to impose default month to all individuals (default = F) 
    ageImposeDay = F # Whether to impose default day to all individuals (default = F)
  )

# Source:   table<og_003_1713352418> [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date   age
                  <int>      <int> <date>            <date>          <dbl>
 1                    1        318 1960-06-27        2018-10-02          7
 2                    1        762 1979-01-06        2019-02-13         24
 3                    1       1068 1978-07-22        2018-11-02          4
 4                    1       1508 1941-12-12        2019-05-14         17
 5                    1       1656 1992-07-14        2018-01-27         14
 6                    1       2191 1971-12-13        2018-05-13          1
 7                    1       2597 1976-11-23        2018-12-18          5
 8                    1       3208 2003-09-08        2019-01-18         35
 9                    1       3250 1977-06-07        2019-02-12         14
10                    1       3494 1962-02-22        2018-11-29          2
# ℹ more rows

addAge()

You can also add an age group:

cdm$drug_exposure |>
  addAge(
    indexDate = "drug_exposure_start_date",
    ageGroup = list(c(0, 39), c(40, Inf))
  ) |>
  glimpse()

Rows: ??
Columns: 25
Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
$ drug_exposure_id             <int> 60926, 26418, 54785, 47027, 38712, 21897, 52764, 8667, 10459, 10524, 9646, 9653, 86272, 974…
$ person_id                    <int> 1332, 576, 4550, 3895, 3199, 476, 1154, 186, 871, 877, 806, 806, 1892, 817, 820, 1746, 2230…
$ drug_concept_id              <int> 40213198, 40213260, 1118084, 40162522, 19059056, 40213296, 40213160, 40213160, 1127078, 112…
$ drug_exposure_start_date     <date> 2010-10-06, 2017-10-25, 2000-11-13, 1982-03-30, 1975-08-19, 2008-05-31, 1971-05-16, 1976-0…
$ drug_exposure_start_datetime <dttm> 2010-10-06, 2017-10-25, 2000-11-13, 1982-03-30, 1975-08-19, 2008-05-31, 1971-05-16, 1976-0…
$ drug_exposure_end_date       <date> 2010-10-06, 2017-10-25, 2000-11-13, 1982-03-30, 1975-09-02, 2008-05-31, 1971-05-16, 1976-0…
$ drug_exposure_end_datetime   <dttm> 2010-10-06, 2017-10-25, 2000-11-13, 1982-03-30, 1975-09-02, 2008-05-31, 1971-05-16, 1976-0…
$ verbatim_end_date            <date> 2010-10-06, 2017-10-25, NA, NA, 1975-09-02, 2008-05-31, 1971-05-16, 1976-05-31, 1968-01-08…
$ drug_type_concept_id         <int> 581452, 581452, 38000177, 38000177, 38000177, 581452, 581452, 581452, 38000177, 38000177, 3…
$ stop_reason                  <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ refills                      <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ quantity                     <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ days_supply                  <int> 0, 0, 0, 0, 14, 0, 0, 0, 60, 14, 28, 7, 0, 14, 11, 7, 0, 0, 14, 0, 0, 91, 21, 0, 0, 360, 7,…
$ sig                          <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ route_concept_id             <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ lot_number                   <chr> "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "0", "…
$ provider_id                  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ visit_occurrence_id          <int> 88400, 38145, 303185, 259023, 212733, 31764, 76224, 12657, 57955, 58440, 53468, 53505, 1260…
$ visit_detail_id              <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
$ drug_source_value            <chr> "133", "121", "00025152531", "857005", "243670", "52", "10", "10", "282464", "313782", "282…
$ drug_source_concept_id       <int> 40213198, 40213260, 44923712, 40162522, 19059056, 40213296, 40213160, 40213160, 1127078, 11…
$ route_source_value           <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ dose_unit_source_value       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ age                          <dbl> 67, 51, 38, 58, 2, 40, 0, 4, 1, 15, 5, 52, 23, 41, 28, 58, 12, 61, 57, 34, 60, 39, 2, 36, 7…
$ age_group                    <chr> "40 or above", "40 or above", "0 to 39", "40 or above", "0 to 39", "40 or above", "0 to 39"…

addAge()

You can personalise labels:

cdm$my_cohort |>
  addAge(
    ageGroup = list(
      "age_group" = list("<40" = c(0, 39), ">=40" = c(40, Inf)),
      "category" = list("kids" = c(0, 17), "adults" = c(18, Inf))
    )
  )

# Source:   table<og_008_1713352421> [?? x 7]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date   age age_group category
                  <int>      <int> <date>            <date>          <dbl> <chr>     <chr>   
 1                    1        318 1960-06-27        2018-10-02          7 <40       kids    
 2                    1        762 1979-01-06        2019-02-13         24 <40       adults  
 3                    1       1068 1978-07-22        2018-11-02          4 <40       kids    
 4                    1       1508 1941-12-12        2019-05-14         17 <40       kids    
 5                    1       1656 1992-07-14        2018-01-27         14 <40       kids    
 6                    1       2191 1971-12-13        2018-05-13          1 <40       kids    
 7                    1       2597 1976-11-23        2018-12-18          5 <40       kids    
 8                    1       3208 2003-09-08        2019-01-18         35 <40       adults  
 9                    1       3250 1977-06-07        2019-02-12         14 <40       kids    
10                    1       3494 1962-02-22        2018-11-29          2 <40       kids    
# ℹ more rows

addPriorObservation()

cdm$my_cohort |>
  addPriorObservation()

# Source:   table<og_009_1713352422> [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date prior_observation
                  <int>      <int> <date>            <date>                      <dbl>
 1                    1        318 1960-06-27        2018-10-02                   2610
 2                    1        762 1979-01-06        2019-02-13                   8942
 3                    1       1068 1978-07-22        2018-11-02                   1772
 4                    1       1508 1941-12-12        2019-05-14                   6555
 5                    1       1656 1992-07-14        2018-01-27                   5344
 6                    1       2191 1971-12-13        2018-05-13                    603
 7                    1       2597 1976-11-23        2018-12-18                   2191
 8                    1       3208 2003-09-08        2019-01-18                  13023
 9                    1       3494 1962-02-22        2018-11-29                    931
10                    1       3495 1978-01-24        2018-11-22                   4605
# ℹ more rows

addPriorObservation()

cdm$condition_occurrence |>
  addPriorObservation(
    indexDate = "condition_start_date", 
    priorObservationName = "start_observation", # name of the column
    priorObservationType = "date" # default = "days"
  ) |>
  glimpse()

# Source:   SQL [?? x 13]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   person_id condition_occurrence_id condition_concept_id condition_start_date condition_start_datetime condition_end_date
       <int>                   <int>                <int> <date>               <dttm>                   <date>            
 1       263                    4483              4112343 2015-10-02           2015-10-02 00:00:00      2015-10-14        
 2       273                    4657               192671 2011-10-10           2011-10-10 00:00:00      NA                
 3       283                    4815                28060 1984-02-15           1984-02-15 00:00:00      1984-02-25        
 4       293                    4981               378001 2005-11-07           2005-11-07 00:00:00      2005-12-07        
 5       304                    5153               257012 1974-07-30           1974-07-30 00:00:00      1974-11-05        
 6       334                    5655             40481087 1999-07-12           1999-07-12 00:00:00      1999-07-19        
 7       341                    5811             40481087 1990-09-14           1990-09-14 00:00:00      1990-10-05        
 8       351                    5977             40481087 1986-02-24           1986-02-24 00:00:00      1986-03-17        
 9       362                    6143              4113008 1998-07-03           1998-07-03 00:00:00      1998-07-17        
10       370                    6309               372328 1970-03-15           1970-03-15 00:00:00      1970-04-23        
# ℹ more rows
# ℹ 7 more variables: condition_end_datetime <dttm>, condition_type_concept_id <int>, condition_status_concept_id <int>,
#   condition_source_value <chr>, condition_source_concept_id <int>, condition_status_source_value <chr>,
#   start_observation <date>

addFutureObservation()

cdm$my_cohort |>
  addFutureObservation()

# Source:   table<og_011_1713352424> [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date future_observation
                  <int>      <int> <date>            <date>                       <dbl>
 1                    1        318 1960-06-27        2018-10-02                   21281
 2                    1        762 1979-01-06        2019-02-13                   14648
 3                    1       1068 1978-07-22        2018-11-02                   14713
 4                    1       1508 1941-12-12        2019-05-14                   28277
 5                    1       1656 1992-07-14        2018-01-27                    9328
 6                    1       2191 1971-12-13        2018-05-13                   16953
 7                    1       2597 1976-11-23        2018-12-18                   15365
 8                    1       3208 2003-09-08        2019-01-18                    5611
 9                    1       3494 1962-02-22        2018-11-29                   20734
10                    1       3495 1978-01-24        2018-11-22                   14912
# ℹ more rows

addInObservation()

cdm$condition_occurrence |>
  addInObservation(indexDate = "condition_start_date") |>
  filter(in_observation == 0) |>
  select("condition_concept_id", "person_id", "condition_start_date", "in_observation")

# Source:   SQL [?? x 4]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   condition_concept_id person_id condition_start_date in_observation
                  <int>     <int> <date>                        <dbl>
 1              4029498       539 1933-03-26                        0
 2                80180       156 2019-01-30                        0
 3              4029498       211 1967-10-12                        0
 4                80180       588 2019-06-08                        0
 5                80180       339 2019-06-21                        0
 6                80180       220 2019-05-15                        0
 7                80180       843 2017-09-04                        0
 8                80180       884 2017-09-16                        0
 9                80180      1100 2019-05-08                        0
10                80180       814 2019-05-24                        0
# ℹ more rows

addInObservation() window

cdm$my_cohort |>
  addInObservation(
    window = list("20yr" = c(7300, 7665), "40yr" = c(14600, 14965), "60yr" = c(21900, 22265)),
    completeInterval = T, 
    nameStyle = "obs_{window_name}"
  )

# Source:   table<og_015_1713352427> [?? x 7]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date obs_20yr obs_40yr obs_60yr
                  <int>      <int> <date>            <date>             <dbl>    <dbl>    <dbl>
 1                    1        318 1960-06-27        2018-10-02             1        1        0
 2                    1        762 1979-01-06        2019-02-13             1        0        0
 3                    1       1068 1978-07-22        2018-11-02             1        0        0
 4                    1       1508 1941-12-12        2019-05-14             1        1        1
 5                    1       1656 1992-07-14        2018-01-27             1        0        0
 6                    1       2191 1971-12-13        2018-05-13             1        1        0
 7                    1       2597 1976-11-23        2018-12-18             1        1        0
 8                    1       3208 2003-09-08        2019-01-18             0        0        0
 9                    1       3494 1962-02-22        2018-11-29             1        1        0
10                    1       3495 1978-01-24        2018-11-22             1        0        0
# ℹ more rows

addDateOfBirth()

cdm$my_cohort |>
  addDateOfBirth()

# Source:   SQL [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date date_of_birth
                  <int>      <int> <date>            <date>          <date>       
 1                    1        318 1960-06-27        2018-10-02      1953-05-04   
 2                    1        762 1979-01-06        2019-02-13      1954-07-14   
 3                    1       1068 1978-07-22        2018-11-02      1973-09-14   
 4                    1       1508 1941-12-12        2019-05-14      1924-01-01   
 5                    1       1656 1992-07-14        2018-01-27      1977-11-26   
 6                    1       2191 1971-12-13        2018-05-13      1970-04-19   
 7                    1       2597 1976-11-23        2018-12-18      1970-11-24   
 8                    1       3208 2003-09-08        2019-01-18      1968-01-12   
 9                    1       3250 1977-06-07        2019-02-12      1963-01-08   
10                    1       3494 1962-02-22        2018-11-29      1959-08-06   
# ℹ more rows

addDemographics()

cdm$my_cohort |>
  addDemographics()

# Source:   table<og_016_1713352428> [?? x 8]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date   age sex    prior_observation future_observation
                  <int>      <int> <date>            <date>          <dbl> <chr>              <dbl>              <dbl>
 1                    1        318 1960-06-27        2018-10-02          7 Female              2610              21281
 2                    1        762 1979-01-06        2019-02-13         24 Male                8942              14648
 3                    1       1068 1978-07-22        2018-11-02          4 Male                1772              14713
 4                    1       1508 1941-12-12        2019-05-14         17 Female              6555              28277
 5                    1       1656 1992-07-14        2018-01-27         14 Female              5344               9328
 6                    1       2191 1971-12-13        2018-05-13          1 Male                 603              16953
 7                    1       2597 1976-11-23        2018-12-18          5 Male                2191              15365
 8                    1       3208 2003-09-08        2019-01-18         35 Female             13023               5611
 9                    1       3494 1962-02-22        2018-11-29          2 Female               931              20734
10                    1       3495 1978-01-24        2018-11-22         12 Female              4605              14912
# ℹ more rows

addDemographics()

cdm$my_cohort |>
  addDemographics(
    age = TRUE,
    ageGroup = list("kids" = c(0, 17), "adults" = c(18, Inf)),
    sex = FALSE,
    priorObservation = TRUE,
    priorObservationName = "observation_start",
    priorObservationType = "date",
    futureObservation = TRUE,
    futureObservationName = "observation_end",
    futureObservationType = "date"
  )

# Source:   table<og_018_1713352430> [?? x 8]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date   age observation_start observation_end age_group
                  <int>      <int> <date>            <date>          <dbl> <date>            <date>          <chr>    
 1                    1        318 1960-06-27        2018-10-02          7 1953-05-05        2018-10-02      kid      
 2                    1        762 1979-01-06        2019-02-13         24 1954-07-14        2019-02-13      adult    
 3                    1       1068 1978-07-22        2018-11-02          4 1973-09-14        2018-11-02      kid      
 4                    1       1508 1941-12-12        2019-05-14         17 1924-01-01        2019-05-14      kid      
 5                    1       1656 1992-07-14        2018-01-27         14 1977-11-26        2018-01-27      kid      
 6                    1       2191 1971-12-13        2018-05-13          1 1970-04-19        2018-05-13      kid      
 7                    1       2597 1976-11-23        2018-12-18          5 1970-11-24        2018-12-18      kid      
 8                    1       3208 2003-09-08        2019-01-18         35 1968-01-12        2019-01-18      adult    
 9                    1       3494 1962-02-22        2018-11-29          2 1959-08-06        2018-11-29      kid      
10                    1       3495 1978-01-24        2018-11-22         12 1965-06-16        2018-11-22      kid      
# ℹ more rows

add intersections overview

origin table

indexDate Column that indicates the “origin” date.
window Window list to specify the interest interval from the indexDate.
censorDate Column that indicates the “end” of followup.

add intersections overview

target

Cohort: targetCohortTable + targetCohortId + (targetStartDate)
Concept: conceptSet + (targetStartDate)
Table: tableName + (targetStartDate)

add intersections overview

Estimate

flag: NA, 0, 1 (extra argument: targetEndDate)
count: NA/integer (extra argument: targetEndDate)
date: NA/date (extra argument: order)
days: NA/integer (extra argument: order)

12 functions

addCohortIntersectFlag

cdm$my_cohort |>
  addCohortIntersectFlag(
    targetCohortTable = "drugs", 
    window = list(c(1, 180), c(181, 365))
  )

# Source:   table<og_034_1713352437> [?? x 8]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date aspirin_1_to_180 amoxicillin_1_to_180 amoxicillin_181_to_365
                  <int>      <int> <date>            <date>                     <dbl>                <dbl>                  <dbl>
 1                    1       2191 1971-12-13        2018-05-13                     0                    0                      1
 2                    1       3537 1989-10-30        2019-05-19                     0                    1                      0
 3                    1       4283 1959-02-04        2019-01-26                     1                    0                      0
 4                    1         78 1967-02-07        2018-12-13                     1                    0                      0
 5                    1       2164 1973-01-31        2019-01-04                     0                    0                      0
 6                    1       5065 1970-12-04        2005-03-23                     1                    0                      0
 7                    1       5343 1987-12-09        2018-03-11                     0                    1                      1
 8                    1       4602 1996-06-16        2017-07-31                     0                    1                      1
 9                    1       1217 1973-06-12        2019-04-15                     1                    0                      0
10                    1       2082 1944-05-27        2019-05-16                     0                    0                      0
# ℹ more rows
# ℹ 1 more variable: aspirin_181_to_365 <dbl>

addTableIntersectCount

cdm$my_cohort |>
  addTableIntersectCount(
    tableName = "drug_exposure",
    window = c(0, 365), 
    targetEndDate = NULL,
    nameStyle = "number_prescriptions"
  )

# Source:   table<og_043_1713352439> [?? x 5]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date number_prescriptions
                  <int>      <int> <date>            <date>                         <dbl>
 1                    1       1068 1978-07-22        2018-11-02                         1
 2                    1       2191 1971-12-13        2018-05-13                         1
 3                    1       3537 1989-10-30        2019-05-19                         2
 4                    1       4283 1959-02-04        2019-01-26                         1
 5                    1       5002 1977-04-06        2019-05-26                         4
 6                    1         78 1967-02-07        2018-12-13                         7
 7                    1       1535 1983-08-09        2018-04-09                         1
 8                    1       2164 1973-01-31        2019-01-04                         4
 9                    1       2757 1977-12-11        2018-08-04                         2
10                    1       5065 1970-12-04        2005-03-23                         2
# ℹ more rows

addCohortIntersectDate

cdm$my_cohort |>
  addCohortIntersectDate(
    targetCohortTable = "outcome",
    window = c(0, Inf),
    censor = "cohort_end_date",
    nameStyle = "next_{cohort_name}"
  )

# Source:   table<og_050_1713352442> [?? x 6]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date next_covid next_vaccine
                  <int>      <int> <date>            <date>          <date>     <date>      
 1                    1       1068 1978-07-22        2018-11-02      1992-03-14 NA          
 2                    1       1508 1941-12-12        2019-05-14      1980-07-04 1943-01-12  
 3                    1       2191 1971-12-13        2018-05-13      1972-09-13 NA          
 4                    1       2597 1976-11-23        2018-12-18      1990-12-15 NA          
 5                    1       3208 2003-09-08        2019-01-18      2016-02-09 NA          
 6                    1       3494 1962-02-22        2018-11-29      NA         1984-01-27  
 7                    1       3495 1978-01-24        2018-11-22      1996-02-21 NA          
 8                    1       3537 1989-10-30        2019-05-19      1989-11-23 NA          
 9                    1       3560 1985-04-27        2018-06-28      NA         1988-04-12  
10                    1       4283 1959-02-04        2019-01-26      NA         1959-02-08  
# ℹ more rows

addConceptIntersectDays

cdm$my_cohort |>
  addConceptIntersectDays(
    conceptSet = getDrugIngredientCodes(cdm = cdm),
    window = c(0, 365),
    nameStyle = "next_{concept_name}"
  ) |>
  glimpse()

Rows: ??
Columns: 95
Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
$ cohort_definition_id     <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,…
$ subject_id               <int> 2191, 3537, 4270, 4283, 5002, 78, 2164, 2757, 5065, 5343, 279, 2152, 2402, 3364, 219, 411, 1459…
$ cohort_start_date        <date> 1971-12-13, 1989-10-30, 1991-03-15, 1959-02-04, 1977-04-06, 1967-02-07, 1973-01-31, 1977-12-11…
$ cohort_end_date          <date> 2018-05-13, 2019-05-19, 2019-01-29, 2019-01-26, 2019-05-26, 2018-12-13, 2019-01-04, 2018-08-04…
$ next_diclofenac          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_ampicillin          <dbl> NA, NA, NA, NA, 160, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_astemizole          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_carbamazepine       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_penicillin_g        <dbl> NA, NA, NA, NA, NA, 147, NA, NA, NA, NA, 132, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
$ next_oxycodone           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_nitrofurantoin      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_ferrous_fumarate    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_clavulanate         <dbl> 275, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 178, NA, NA, N…
$ next_mestranol           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_doxylamine          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_chlorpheniramine    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 101, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_prednisone          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_cefuroxime          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_simvastatin         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_atorvastatin        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_acetaminophen       <dbl> NA, 24, NA, NA, 160, NA, NA, 85, NA, 171, NA, NA, 222, NA, NA, 314, NA, NA, 220, 267, NA, 326, …
$ next_penicillin_v        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 44, NA, NA, NA, 0, 75, 0, NA, NA, NA, NA, NA, NA, N…
$ next_aspirin             <dbl> NA, NA, NA, 4, NA, 147, 206, NA, 179, NA, 132, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 57, …
$ next_norethindrone       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_inert_ingredients   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_diphenhydramine     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 88,…
$ next_hydrocodone         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_amlodipine          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_nitroglycerin       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_atropine            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_dextromethorphan    <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_diazepam            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_albuterol           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 230, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_methylphenidate     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_fluticasone         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 230, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_clopidogrel         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_cefaclor            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 149, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_naproxen            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_amoxicillin         <dbl> 275, 24, NA, NA, NA, NA, NA, NA, NA, 171, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 178, NA, NA, …
$ next_norgestimate        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_celecoxib           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_epinephrine         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_doxycycline         <dbl> NA, NA, NA, NA, NA, NA, 206, NA, 179, NA, NA, NA, 222, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
$ next_meperidine          <dbl> NA, NA, 298, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_salmeterol          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_ibuprofen           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 149, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
$ next_ethinyl_estradiol   <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_phenazopyridine     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_hydrocortisone      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_ondansetron         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_loratadine          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_morphine            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_dornase_alfa        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_cetirizine          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_terfenadine         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_fexofenadine        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_fentanyl            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_methotrexate        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_verapamil           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_vitamin_b_12        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_amiodarone          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_warfarin            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_digoxin             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_alteplase           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_heparin             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_ferrous_sulfate     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_medroxyprogesterone <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_levothyroxine       <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_drospirenone        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_norelgestromin      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_etonogestrel        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_estradiol           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_alendronate         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_levonorgestrel      <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_tazobactam          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_piperacillin        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_desflurane          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_rocuronium          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_cyclosporine        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_remifentanil        <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_sevoflurane         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_dienogest           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_alfentanil          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_sufentanil          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_memantine           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_midazolam           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_donepezil           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_atomoxetine         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_propofol            <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_galantamine         <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_isoflurane          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_lorazepam           <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_tacrine             <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_pancreatin          <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ next_sodium_chloride     <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…

Analysis example

x <- cdm$covid_cohort |>
  addDemographics() |>
  mutate(future_observation = if_else(future_observation > 180, 180, future_observation)) |>
  mutate(cohort_end_date = as.Date(!!dateadd("cohort_start_date", "future_observation"))) |>
  addCohortIntersectDays(
    targetCohortTable = outcomeCohort,
    window = list(c(1, Inf)),
    censorDate = "cohort_end_date",
    nameStyle = "{cohort_name}"
  ) |>
  addCohortIntersectFlag(
    targetCohortTable = outcomeCohort,
    window = list(c(-180, 0)),
    nameStyle = "washout_{cohort_name}"
  ) |>
  addCohortIntersectFlag(
    targetCohortTable = conditionsCohort,
    window = list("short" = c(-30, -1), "mid" = c(-365, -1), "any" = c(-Inf, -1)),
    nameStyle = "{window_name}_{cohort_name}"
  ) |>
  addCohortIntersectFlag(
    targetCohortTable = medicationsCohort,
    targetCohortId = getId(cdm[[medicationsCohort]], c("glucocorticoids", "antithromb")),
    window = list("shortmed" = c(-30, -1), "midmed" = c(-183, -1)),
    nameStyle = "{window_name}_{cohort_name}"
  ) |>
  mutate(across(
    cohortSet(cdm[[outcomeCohort]])$cohort_name,
    ~ if_else(!is.na(.x), 1, 0),
    .names = "status_{.col}"
  )) |>
  mutate(across(
    cohortSet(cdm[[outcomeCohort]])$cohort_name,
    ~ if_else(!is.na(.x), .x, future_observation),
    .names = "time_{.col}"
  )) |>
  collect()

-> Apply outcome model

Summarise data

x <- cdm$my_cohort |>
  addConceptIntersectFlag(
    conceptSet = list("ibuprofen" = c(19019979, 19078461, 1177480)), 
    window = c(-Inf, 0), 
    nameStyle = "prior_ibuprofen"
  ) |>
  addTableIntersectCount(
    window = c(-Inf, Inf),
    tableName = "condition_occurrence", 
    nameStyle = "number_conditions"
  ) |>
  addDemographics()

Summarise data

# Source:   table<og_083_1713352454> [?? x 10]
# Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.1/C:\Users\martics\AppData\Local\Temp\Rtmp2dwceZ\file72183cae1e8c.duckdb]
   cohort_definition_id subject_id cohort_start_date cohort_end_date prior_ibuprofen number_conditions   age sex   
                  <int>      <int> <date>            <date>                    <dbl>             <dbl> <dbl> <chr> 
 1                    1       1656 1992-07-14        2018-01-27                    1                 8    14 Female
 2                    1       3208 2003-09-08        2019-01-18                    1                14    35 Female
 3                    1       3537 1989-10-30        2019-05-19                    1                24    28 Male  
 4                    1       5002 1977-04-06        2019-05-26                    1                10     0 Male  
 5                    1       2757 1977-12-11        2018-08-04                    1                12     4 Male  
 6                    1       5343 1987-12-09        2018-03-11                    1                22     4 Female
 7                    1        443 2016-11-06        2018-12-18                    1                22    47 Male  
 8                    1       3129 2003-03-02        2019-06-12                    1                19    31 Female
 9                    1       3595 1996-08-09        2019-02-28                    1                20    47 Male  
10                    1        563 2002-03-14        2005-03-20                    1                18    37 Male  
# ℹ more rows
# ℹ 2 more variables: prior_observation <dbl>, future_observation <dbl>

Summarise data

x |>
  group_by(sex, prior_ibuprofen) |>
  summarise(
    mean_conditions = mean(number_conditions),
    mean_age = mean(age),
    mean_followup = mean(future_observation),
    .groups = "drop"
  ) |>
  collect()

# A tibble: 4 × 5
  sex    prior_ibuprofen mean_conditions mean_age mean_followup
  <chr>            <dbl>           <dbl>    <dbl>         <dbl>
1 Female               1            18.0     21.5         8916.
2 Male                 0            21.4     11.4        17943.
3 Male                 1            17.3     22.5         9046.
4 Female               0            22.1     11.1        18240.

Summarise data

summariseResult(
  table = x, # table to summarise 
  strata = list("sex", c("sex", "prior_ibuprofen")), # strata
  includeOverallStrata = TRUE,
  variables = list(
    c("number_conditions", "age", "future_observation"),
    c("sex")
  ), 
  estimates = list(
    c("median", "q25", "q75"),
    c("count", "percentage")
  )
)

Summarise data

# A tibble: 93 × 16
   result_id cdm_name       result_type package_name package_version group_name group_level strata_name strata_level variable_name
       <int> <chr>          <chr>       <chr>        <chr>           <chr>      <chr>       <chr>       <chr>        <chr>        
 1         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      number recor…
 2         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      number subje…
 3         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      number_condi…
 4         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      number_condi…
 5         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      number_condi…
 6         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      age          
 7         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      age          
 8         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      age          
 9         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      sex          
10         1 Synthea synth… summarise_… PatientProf… 0.7.0           overall    overall     overall     overall      sex          
# ℹ 83 more rows
# ℹ 6 more variables: variable_level <chr>, estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#   additional_name <chr>, additional_level <chr>

Overview

Used internally in other packages (DrugUtilisation, CohortSurvival, …)
Used in complex study
Not needed for Off The Shelf Studies

CohortCharacteristics

summariseCharacteristics
summariseLargeScaleCharacteristics
summariseCohortOverlap
summariseCohortTiming
Each function has an associated table and plot function

summariseCharacteristics

cdm$my_cohort |>
  addSex() |>
  summariseCharacteristics(
    strata = "sex",
    demographics = TRUE,
    ageGroup = list(c(0, 19), c(20, 39), c(40, 59), c(60, 79), c(80, Inf)),
    tableIntersect = list(
      "Number of visits prior year" = list(
        tableName = "visit_occurrence", value = "count", window = c(-365, 0)
      )
    ),
    cohortIntersect = list(
      "Conditions any time prior" = list(
        targetCohortTable = "conditions", value = "flag", window = c(-Inf, 0)
      ),
      "Medications prior year" = list(
        targetCohortTable = "medications", value = "flag", window = c(-365, 0)
      )
    ),
    conceptIntersect = list() 
  )

summariseCharacteristics

Rows: 260
Columns: 16
$ result_id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ cdm_name         <chr> "Synthea synthetic health database", "Synthea synthetic health database", "Synthea synthetic health dat…
$ result_type      <chr> "summarised_characteristics", "summarised_characteristics", "summarised_characteristics", "summarised_c…
$ package_name     <chr> "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "Patient…
$ package_version  <chr> "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8…
$ group_name       <chr> "cohort_name", "cohort_name", "cohort_name", "cohort_name", "cohort_name", "cohort_name", "cohort_name"…
$ group_level      <chr> "viral_pharyngitis", "viral_pharyngitis", "viral_pharyngitis", "viral_pharyngitis", "viral_pharyngitis"…
$ strata_name      <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ strata_level     <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ variable_name    <chr> "Number records", "Number subjects", "Cohort start date", "Cohort start date", "Cohort start date", "Co…
$ variable_level   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "Fe…
$ estimate_name    <chr> "count", "count", "min", "q05", "q25", "median", "q75", "q95", "max", "min", "q05", "q25", "median", "q…
$ estimate_type    <chr> "integer", "integer", "date", "date", "date", "date", "date", "date", "date", "date", "date", "date", "…
$ estimate_value   <chr> "2606", "2606", "1909-09-15", "1936-02-29", "1959-06-14", "1972-02-16", "1983-09-04", "2003-11-29", "20…
$ additional_name  <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ additional_level <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…

tableCharacteristics

result |>
  tableCharacteristics()

CDM name	Sex	Variable name	Variable level	Estimate name	Cohort name
CDM name	Sex	Variable name	Variable level	Estimate name	Viral pharyngitis
Synthea synthetic health database	Overall	Number records	-	N	2,606
		Number subjects	-	N	2,606
		Cohort start date	-	Median [Q25 - Q75]	1972-02-16 [1959-06-14 - 1983-09-04]
				[Q05 - Q95]	[1936-02-29 - 2003-11-29]
				Range	1909-09-15 to 2019-06-02
		Cohort end date	-	Median [Q25 - Q75]	2018-12-16 [2018-08-05 - 2019-04-07]
				[Q05 - Q95]	[2009-05-06 - 2019-06-15]
				Range	1961-02-26 to 2019-07-03
		Age	-	Median [Q25 - Q75]	8.00 [3.00 - 18.00]
				[Q05 - Q95]	[0.00 - 39.00]
				Mean (SD)	12.55 (12.58)
				Range	0.00 to 77.00
		Sex	Female	N (%)	1,322 (50.7%)
			Male	N (%)	1,284 (49.3%)
		Prior observation	-	Median [Q25 - Q75]	3,196.00 [1,381.00 - 6,694.00]
				[Q05 - Q95]	[273.25 - 14,437.50]
				Mean (SD)	4,766.35 (4,597.04)
				Range	31.00 to 28,275.00
		Future observation	-	Median [Q25 - Q75]	16,859.50 [12,641.00 - 21,337.75]
				[Q05 - Q95]	[5,239.50 - 28,545.25]
				Mean (SD)	16,973.36 (6,994.18)
				Range	0.00 to 39,757.00
		Age group	0 to 19	N (%)	2,022 (77.6%)
			20 to 39	N (%)	459 (17.6%)
			40 to 59	N (%)	111 (4.3%)
			60 to 79	N (%)	14 (0.5%)
		Number of visits prior year	-	Median [Q25 - Q75]	0.00 [0.00 - 0.00]
				[Q05 - Q95]	[0.00 - 0.00]
				Mean (SD)	0.00 (0.06)
				Range	0.00 to 1.00
		Conditions any time prior	Asthma	N (%)	73 (2.8%)
			Myocardial infarction	N (%)	<5 (<5%)
			Infection	N (%)	2,606 (100.0%)
			Fracture	N (%)	572 (21.9%)
			Allergy	N (%)	162 (6.2%)
			Pneumonia	N (%)	0 (0.0%)
		Medications prior year	Respiratory system	N (%)	83 (3.2%)
			Nervous system	N (%)	487 (18.7%)
			Antineoplastic and immunomodulating agents	N (%)	14 (0.5%)
			Musculoskeletal system	N (%)	48 (1.8%)
			Dermatologicals	N (%)	7 (0.3%)
			Antiinfectives for systemic use	N (%)	425 (16.3%)
	Female	Number records	-	N	1,322
	Male	Number records	-	N	1,284
	Female	Number subjects	-	N	1,322
	Male	Number subjects	-	N	1,284
	Female	Cohort start date	-	Median [Q25 - Q75]	1971-08-16 [1959-07-22 - 1983-04-09]
				[Q05 - Q95]	[1932-08-01 - 2003-09-06]
				Range	1909-09-15 to 2018-10-21
	Male	Cohort start date	-	Median [Q25 - Q75]	1972-07-21 [1959-05-17 - 1983-12-22]
				[Q05 - Q95]	[1939-03-19 - 2004-03-15]
				Range	1913-02-25 to 2019-06-02
	Female	Cohort end date	-	Median [Q25 - Q75]	2018-12-20 [2018-08-18 - 2019-04-07]
				[Q05 - Q95]	[2007-12-27 - 2019-06-17]
				Range	1961-02-26 to 2019-07-01
	Male	Cohort end date	-	Median [Q25 - Q75]	2018-12-10 [2018-07-27 - 2019-04-04]
				[Q05 - Q95]	[2011-01-02 - 2019-06-13]
				Range	1967-02-18 to 2019-07-03
	Female	Age	-	Median [Q25 - Q75]	8.00 [3.00 - 17.00]
				[Q05 - Q95]	[0.00 - 39.95]
				Mean (SD)	12.26 (12.45)
				Range	0.00 to 70.00
	Male	Age	-	Median [Q25 - Q75]	9.00 [3.00 - 19.00]
				[Q05 - Q95]	[0.00 - 38.85]
				Mean (SD)	12.84 (12.70)
				Range	0.00 to 77.00
	Female	Sex	Female	N (%)	1,322 (100.0%)
	Male	Sex	Male	N (%)	1,284 (100.0%)
	Female	Prior observation	-	Median [Q25 - Q75]	3,079.50 [1,353.25 - 6,506.50]
				[Q05 - Q95]	[305.35 - 14,609.55]
				Mean (SD)	4,659.20 (4,555.49)
				Range	31.00 to 25,874.00
	Male	Prior observation	-	Median [Q25 - Q75]	3,333.00 [1,384.00 - 7,010.00]
				[Q05 - Q95]	[271.00 - 14,250.90]
				Mean (SD)	4,876.67 (4,638.62)
				Range	33.00 to 28,275.00
	Female	Future observation	-	Median [Q25 - Q75]	17,136.00 [12,804.75 - 21,375.00]
				[Q05 - Q95]	[5,251.60 - 29,126.15]
				Mean (SD)	17,182.27 (7,106.33)
				Range	1.00 to 39,757.00
	Male	Future observation	-	Median [Q25 - Q75]	16,637.50 [12,519.50 - 21,257.25]
				[Q05 - Q95]	[5,243.30 - 27,830.75]
				Mean (SD)	16,758.27 (6,872.92)
				Range	0.00 to 38,523.00
	Female	Age group	0 to 19	N (%)	1,041 (78.7%)
	Male	Age group	0 to 19	N (%)	981 (76.4%)
	Female	Age group	20 to 39	N (%)	214 (16.2%)
	Male	Age group	20 to 39	N (%)	245 (19.1%)
	Female	Age group	40 to 59	N (%)	61 (4.6%)
	Male	Age group	40 to 59	N (%)	50 (3.9%)
	Female	Age group	60 to 79	N (%)	6 (0.5%)
	Male	Age group	60 to 79	N (%)	8 (0.6%)
	Female	Number of visits prior year	-	Median [Q25 - Q75]	0.00 [0.00 - 0.00]
				[Q05 - Q95]	[0.00 - 0.00]
				Mean (SD)	0.00 (0.05)
				Range	0.00 to 1.00
	Male	Number of visits prior year	-	Median [Q25 - Q75]	0.00 [0.00 - 0.00]
				[Q05 - Q95]	[0.00 - 0.00]
				Mean (SD)	0.00 (0.07)
				Range	0.00 to 1.00
	Female	Conditions any time prior	Asthma	N (%)	38 (2.9%)
	Male	Conditions any time prior	Asthma	N (%)	35 (2.7%)
	Female	Conditions any time prior	Myocardial infarction	N (%)	0 (0.0%)
	Male	Conditions any time prior	Myocardial infarction	N (%)	<5 (<5%)
	Female	Conditions any time prior	Infection	N (%)	1,322 (100.0%)
	Male	Conditions any time prior	Infection	N (%)	1,284 (100.0%)
	Female	Conditions any time prior	Fracture	N (%)	273 (20.7%)
	Male	Conditions any time prior	Fracture	N (%)	299 (23.3%)
	Female	Conditions any time prior	Allergy	N (%)	84 (6.4%)
	Male	Conditions any time prior	Allergy	N (%)	78 (6.1%)
	Female	Conditions any time prior	Pneumonia	N (%)	0 (0.0%)
	Male	Conditions any time prior	Pneumonia	N (%)	0 (0.0%)
	Female	Medications prior year	Respiratory system	N (%)	47 (3.6%)
	Male	Medications prior year	Respiratory system	N (%)	36 (2.8%)
	Female	Medications prior year	Nervous system	N (%)	251 (19.0%)
	Male	Medications prior year	Nervous system	N (%)	236 (18.4%)
	Female	Medications prior year	Antineoplastic and immunomodulating agents	N (%)	9 (0.7%)
	Male	Medications prior year	Antineoplastic and immunomodulating agents	N (%)	5 (0.4%)
	Female	Medications prior year	Musculoskeletal system	N (%)	25 (1.9%)
	Male	Medications prior year	Musculoskeletal system	N (%)	23 (1.8%)
	Female	Medications prior year	Dermatologicals	N (%)	<5 (<5%)
	Male	Medications prior year	Dermatologicals	N (%)	5 (0.4%)
	Female	Medications prior year	Antiinfectives for systemic use	N (%)	224 (16.9%)
	Male	Medications prior year	Antiinfectives for systemic use	N (%)	201 (15.7%)

tableCharacteristics

result |>
  tableCharacteristics(
    header = c("strata"),
    formatEstimateName = c(
      "N(%)" = "<count> (<percentage>%)", 
      "median [IQR]" = "<median> [<q25> - <q75>]"
    ), 
    excludeColumns = c(
      "cdm_name", "result_id", "result_type", "package_name", "package_version", 
      "estimate_type", "additional_name", "additional_level", "cohort_name"
    ),
    .options = list(keepNotFormatted = FALSE)
  )

tableCharacteristics

Variable name	Variable level	Estimate name	Sex
Variable name	Variable level	Estimate name	Overall	Female	Male
Cohort start date	-	median [IQR]	1972-02-16 [1959-06-14 - 1983-09-04]	1971-08-16 [1959-07-22 - 1983-04-09]	1972-07-21 [1959-05-17 - 1983-12-22]
Cohort end date	-	median [IQR]	2018-12-16 [2018-08-05 - 2019-04-07]	2018-12-20 [2018-08-18 - 2019-04-07]	2018-12-10 [2018-07-27 - 2019-04-04]
Age	-	median [IQR]	8.00 [3.00 - 18.00]	8.00 [3.00 - 17.00]	9.00 [3.00 - 19.00]
Sex	Female	N(%)	1,322 (50.7%)	1,322 (100.0%)	-
	Male	N(%)	1,284 (49.3%)	-	1,284 (100.0%)
Prior observation	-	median [IQR]	3,196.00 [1,381.00 - 6,694.00]	3,079.50 [1,353.25 - 6,506.50]	3,333.00 [1,384.00 - 7,010.00]
Future observation	-	median [IQR]	16,859.50 [12,641.00 - 21,337.75]	17,136.00 [12,804.75 - 21,375.00]	16,637.50 [12,519.50 - 21,257.25]
Age group	0 to 19	N(%)	2,022 (77.6%)	1,041 (78.7%)	981 (76.4%)
	20 to 39	N(%)	459 (17.6%)	214 (16.2%)	245 (19.1%)
	40 to 59	N(%)	111 (4.3%)	61 (4.6%)	50 (3.9%)
	60 to 79	N(%)	14 (0.5%)	6 (0.5%)	8 (0.6%)
Number of visits prior year	-	median [IQR]	0.00 [0.00 - 0.00]	0.00 [0.00 - 0.00]	0.00 [0.00 - 0.00]
Conditions any time prior	Asthma	N(%)	73 (2.8%)	38 (2.9%)	35 (2.7%)
	Myocardial infarction	N(%)	<5 (<5%)	0 (0.0%)	<5 (<5%)
	Infection	N(%)	2,606 (100.0%)	1,322 (100.0%)	1,284 (100.0%)
	Fracture	N(%)	572 (21.9%)	273 (20.7%)	299 (23.3%)
	Allergy	N(%)	162 (6.2%)	84 (6.4%)	78 (6.1%)
	Pneumonia	N(%)	0 (0.0%)	0 (0.0%)	0 (0.0%)
Medications prior year	Respiratory system	N(%)	83 (3.2%)	47 (3.6%)	36 (2.8%)
	Nervous system	N(%)	487 (18.7%)	251 (19.0%)	236 (18.4%)
	Antineoplastic and immunomodulating agents	N(%)	14 (0.5%)	9 (0.7%)	5 (0.4%)
	Musculoskeletal system	N(%)	48 (1.8%)	25 (1.9%)	23 (1.8%)
	Dermatologicals	N(%)	7 (0.3%)	<5 (<5%)	5 (0.4%)
	Antiinfectives for systemic use	N(%)	425 (16.3%)	224 (16.9%)	201 (15.7%)

tableDemographics

Variable name	Variable level	Estimate name	Sex
Variable name	Variable level	Estimate name	Overall	Female	Male
Age	-	median [IQR]	8.00 [3.00 - 18.00]	8.00 [3.00 - 17.00]	9.00 [3.00 - 19.00]
Sex	Female	N(%)	1,322 (50.7%)	1,322 (100.0%)	-
	Male	N(%)	1,284 (49.3%)	-	1,284 (100.0%)
Prior observation	-	median [IQR]	3,196.00 [1,381.00 - 6,694.00]	3,079.50 [1,353.25 - 6,506.50]	3,333.00 [1,384.00 - 7,010.00]
Future observation	-	median [IQR]	16,859.50 [12,641.00 - 21,337.75]	17,136.00 [12,804.75 - 21,375.00]	16,637.50 [12,519.50 - 21,257.25]
Age group	0 to 19	N(%)	2,022 (77.6%)	1,041 (78.7%)	981 (76.4%)
	20 to 39	N(%)	459 (17.6%)	214 (16.2%)	245 (19.1%)
	40 to 59	N(%)	111 (4.3%)	61 (4.6%)	50 (3.9%)
	60 to 79	N(%)	14 (0.5%)	6 (0.5%)	8 (0.6%)

plotDemographics

tableCohortIntersect

Variable name	Variable level	Estimate name	Sex
Variable name	Variable level	Estimate name	Overall	Female	Male
Conditions any time prior	Asthma	N(%)	73 (2.8%)	38 (2.9%)	35 (2.7%)
	Myocardial infarction	N(%)	<5 (<5%)	0 (0.0%)	<5 (<5%)
	Infection	N(%)	2,606 (100.0%)	1,322 (100.0%)	1,284 (100.0%)
	Fracture	N(%)	572 (21.9%)	273 (20.7%)	299 (23.3%)
	Allergy	N(%)	162 (6.2%)	84 (6.4%)	78 (6.1%)
	Pneumonia	N(%)	0 (0.0%)	0 (0.0%)	0 (0.0%)
Medications prior year	Respiratory system	N(%)	83 (3.2%)	47 (3.6%)	36 (2.8%)
	Nervous system	N(%)	487 (18.7%)	251 (19.0%)	236 (18.4%)
	Antineoplastic and immunomodulating agents	N(%)	14 (0.5%)	9 (0.7%)	5 (0.4%)
	Musculoskeletal system	N(%)	48 (1.8%)	25 (1.9%)	23 (1.8%)
	Dermatologicals	N(%)	7 (0.3%)	<5 (<5%)	5 (0.4%)
	Antiinfectives for systemic use	N(%)	425 (16.3%)	224 (16.9%)	201 (15.7%)

plotCohortIntersect

summariseLargeScaleCharacterisation

result <- cdm$my_cohort |>
  summariseLargeScaleCharacteristics(
    window = list(c(-365, -1), c(0, 0), c(1, 365)),
    eventInWindow = "condition_occurrence",
    episodeInWindow = "drug_exposure"
  )
result |> glimpse()

Rows: 156
Columns: 16
$ result_id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ cdm_name         <chr> "Synthea synthetic health database", "Synthea synthetic health database", "Synthea synthetic health dat…
$ result_type      <chr> "summarised_large_scale_characteristics", "summarised_large_scale_characteristics", "summarised_large_s…
$ package_name     <chr> "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "Patient…
$ package_version  <chr> "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8…
$ group_name       <chr> "overall", "overall", "overall", "cohort_name", "cohort_name", "cohort_name", "cohort_name", "cohort_na…
$ group_level      <chr> "overall", "overall", "overall", "viral_pharyngitis", "viral_pharyngitis", "viral_pharyngitis", "viral_…
$ strata_name      <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ strata_level     <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ variable_name    <chr> "settings", "settings", "settings", "Whiplash injury to neck", "Whiplash injury to neck", "Chronic sinu…
$ variable_level   <chr> NA, NA, NA, "-365 to -1", "-365 to -1", "-365 to -1", "-365 to -1", "-365 to -1", "-365 to -1", "-365 t…
$ estimate_name    <chr> "table_name", "type", "analysis", "count", "percentage", "count", "percentage", "count", "percentage", …
$ estimate_type    <chr> "character", "character", "character", "integer", "percentage", "integer", "percentage", "integer", "pe…
$ estimate_value   <chr> "condition_occurrence", "event", "standard", "15", "0.58", "20", "0.77", "18", "0.69", "260", "9.98", "…
$ additional_name  <chr> "overall", "overall", "overall", "concept_id", "concept_id", "concept_id", "concept_id", "concept_id", …
$ additional_level <chr> "overall", "overall", "overall", "4218389", "4218389", "257012", "257012", "4294548", "4294548", "40481…

tableLargeScaleCharacteristics

tableLargeScaleCharacteristics(result)

	CDM name
	Synthea synthetic health database
	Cohort name
	viral_pharyngitis
Concept	Window
Concept	-365 to -1	0 to 0	1 to 365
Table: condition_occurrence; Type: event; Analysis: standard
Viral sinusitis (40481087)	260 (10.0%)	-	259 (9.9%)
Otitis media (372328)	206 (7.9%)	-	158 (6.1%)
Acute bronchitis (260139)	136 (5.2%)	-	141 (5.4%)
Streptococcal sore throat (28060)	60 (2.3%)	-	70 (2.7%)
Sprain of ankle (81151)	40 (1.5%)	-	45 (1.7%)
Osteoarthritis (80180)	18 (0.7%)	-	31 (1.2%)
Acute bacterial sinusitis (4294548)	18 (0.7%)	-	24 (0.9%)
Sprain of wrist (78272)	19 (0.7%)	-	24 (0.9%)
Acute viral pharyngitis (4112343)	-	2,606 (100.0%)	137 (5.3%)
Fracture of forearm (4278672)	-	-	23 (0.9%)
Table: drug_exposure; Type: episode; Analysis: standard
poliovirus vaccine, inactivated (40213160)	311 (11.9%)	-	208 (8.0%)
Penicillin V Potassium 250 MG Oral Tablet (19133873)	59 (2.3%)	162 (6.2%)	225 (8.6%)
Aspirin 81 MG Oral Tablet (19059056)	199 (7.6%)	14 (0.5%)	197 (7.6%)
Acetaminophen 325 MG Oral Tablet (1127433)	141 (5.4%)	13 (0.5%)	156 (6.0%)
Acetaminophen 160 MG Oral Tablet (1127078)	138 (5.3%)	-	106 (4.1%)
Penicillin G 375 MG/ML Injectable Solution (19006318)	79 (3.0%)	-	51 (2.0%)
Haemophilus influenzae type b vaccine, PRP-OMP conjugate (40213314)	69 (2.6%)	-	53 (2.0%)
Amoxicillin 250 MG / Clavulanate 125 MG Oral Tablet (1713671)	67 (2.6%)	-	59 (2.3%)
tetanus and diphtheria toxoids, adsorbed, preservative free, for adult use (40213227)	27 (1.0%)	-	40 (1.5%)
Ampicillin 100 MG/ML Injectable Solution (19129655)	34 (1.3%)	-	24 (0.9%)

plotLargeScaleCharacteristics

lsc <- cdm$my_cohort %>%
  addSex() |>
  summariseLargeScaleCharacteristics(
    strata = list("sex"),
    window = c(-Inf,0),
    eventInWindow ="condition_occurrence"
  )
plotLargeScaleCharacteristics(
    data =  lsc |> 
      filter(estimate_name == "percentage"), 
    colorVars= c("strata_level")
  ) + 
  ylab("") +
  xlab("Percentage") +
  theme_minimal() +
  theme(legend.position = "top", legend.title = element_blank())

plotLargeScaleCharacteristics

summariseCohortOverlap

cdm <- generateConceptCohortSet(
  cdm = cdm,
  conceptSet = list(
    "bacterial_sinusitis" = 4294548,
    "viral_sinusitis" = 40481087,
    "chronic_sinusitis" = 257012,
    "any_sinusitis" = c(4294548, 40481087, 257012)
  ),
  name = "sinusitis"
)

summariseCohortOverlap

result <- summariseCohortOverlap(cdm$sinusitis)
result |>
  glimpse()

Rows: 72
Columns: 16
$ result_id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ cdm_name         <chr> "Synthea synthetic health database", "Synthea synthetic health database", "Synthea synthetic health dat…
$ result_type      <chr> "cohort_overlap", "cohort_overlap", "cohort_overlap", "cohort_overlap", "cohort_overlap", "cohort_overl…
$ package_name     <chr> "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "Patient…
$ package_version  <chr> "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8…
$ group_name       <chr> "cohort_name_reference &&& cohort_name_comparator", "cohort_name_reference &&& cohort_name_comparator",…
$ group_level      <chr> "viral_sinusitis &&& any_sinusitis", "viral_sinusitis &&& any_sinusitis", "viral_sinusitis &&& any_sinu…
$ strata_name      <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ strata_level     <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ variable_name    <chr> "number_subjects", "number_subjects", "number_subjects", "number_subjects", "number_subjects", "number_…
$ variable_level   <chr> "overlap", "only_in_reference", "only_in_comparator", "overlap", "only_in_reference", "only_in_comparat…
$ estimate_name    <chr> "count", "count", "count", "count", "count", "count", "count", "count", "count", "count", "count", "cou…
$ estimate_type    <chr> "integer", "integer", "integer", "integer", "integer", "integer", "integer", "integer", "integer", "int…
$ estimate_value   <chr> "2686", "0", "2", "812", "0", "1876", "810", "1876", "2", "786", "0", "1902", "785", "1901", "1", "2686…
$ additional_name  <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ additional_level <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…

tableCohortOverlap

tableCohortOverlap(result)

CDM name	Cohort name reference	Cohort name comparator	Estimate name	Number subjects
CDM name	Cohort name reference	Cohort name comparator	Estimate name	Only in reference	Only in comparator	Overlap
Synthea synthetic health database	Any sinusitis	Bacterial sinusitis	N (%)	1,902 (70.76%)	0 (0.00%)	786 (29.24%)
		Chronic sinusitis	N (%)	1,876 (69.79%)	0 (0.00%)	812 (30.21%)
		Viral sinusitis	N (%)	<5 (<5%)	0 (0.00%)	2,686 (99.93%)
	Bacterial sinusitis	Chronic sinusitis	N (%)	320 (28.27%)	346 (30.57%)	466 (41.17%)
		Viral sinusitis	N (%)	<5 (<5%)	1,901 (70.75%)	785 (29.21%)
	Chronic sinusitis	Viral sinusitis	N (%)	<5 (<5%)	1,876 (69.79%)	810 (30.13%)

plotCohortOverlap

plotCohortOverlap(result)

summariseCohortTiming

cdm <- generateIngredientCohortSet(
  cdm = cdm, name = "meds", ingredient = c("acetaminophen", "morphine", "warfarin")
)

summariseCohortTiming

meds_timing <- cdm$meds |> 
  summariseCohortTiming(restrictToFirstEntry = TRUE)
meds_timing |> 
  glimpse()

Rows: 43
Columns: 16
$ result_id        <int> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
$ cdm_name         <chr> "Synthea synthetic health database", "Synthea synthetic health database", "Synthea synthetic health dat…
$ result_type      <chr> "cohort_timing", "cohort_timing", "cohort_timing", "cohort_timing", "cohort_timing", "cohort_timing", "…
$ package_name     <chr> "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "PatientProfiles", "Patient…
$ package_version  <chr> "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8.0", "0.8…
$ group_name       <chr> "cohort_name_reference &&& cohort_name_comparator", "cohort_name_reference &&& cohort_name_comparator",…
$ group_level      <chr> "warfarin &&& acetaminophen", "acetaminophen &&& warfarin", "acetaminophen &&& morphine", "morphine &&&…
$ strata_name      <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ strata_level     <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ variable_name    <chr> "number records", "number records", "number records", "number records", "number records", "number recor…
$ variable_level   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
$ estimate_name    <chr> "count", "count", "count", "count", "count", "count", "count", "count", "count", "count", "count", "cou…
$ estimate_type    <chr> "integer", "integer", "integer", "integer", "integer", "integer", "integer", "integer", "integer", "int…
$ estimate_value   <chr> "136", "136", "35", "35", "6", "6", "136", "136", "35", "35", "6", "6", "-33784", "-1106", "-12316", "-…
$ additional_name  <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…
$ additional_level <chr> "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "overall", "ove…

tableCohortTiming

tableCohortTiming(
  meds_timing, 
  .options = list(decimals = c(numeric = 0)), 
  excludeColumns = c("cdm_name", "result_id", "result_type", "package_name", "package_version", "estimate_type")
)

Cohort name reference	Cohort name comparator	Variable name	Variable level	Estimate name	Estimate value
Acetaminophen	Morphine	Number records	-	N	35
		Number subjects	-	N	35
		Diff days	-	Median [Q25 - Q75]	5,769 [1,835 - 12,239]
				Range	-12,316 - 28,231
	Warfarin	Number records	-	N	136
		Number subjects	-	N	136
		Diff days	-	Median [Q25 - Q75]	19,709 [16,926 - 24,462]
				Range	-1,106 - 33,784
Morphine	Warfarin	Number records	-	N	6
		Number subjects	-	N	6
		Diff days	-	Median [Q25 - Q75]	1,658 [-1,737 - 3,783]
				Range	-3,376 - 6,937

plotCohortTiming

plotCohortTiming(meds_timing, facetVarX = "cdm_name") + 
  theme_bw() +
  theme(legend.position = "none", axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1))

plotCohortTiming

meds_timing <- cdm$meds |> 
  summariseCohortTiming(restrictToFirstEntry = TRUE, density = TRUE)
plotCohortTiming(meds_timing, plotType = "density")

Overview

Provide the functionalities to characterise cohorts:
- Table one
- Large scale characteristics
- Cohort overlap
- Cohort timing
Produces standard tables in gt, flextable or tibble format.
Produces standard plot visualisations based on ggplot2 package.
It is designed for users in Off The Shelf Studies.

Roadmap

https://darwin-eu-dev.github.io/PatientProfiles/
split into the two packages
PatientProfiles to 1.0.0 (stable, it has already not changed in the last 6 months)
CohortCharacteristics to 1.0.0 (close, we need users opinions for an stable release)

PatientProfiles tutorial

PatientProfiles

Scope

PatientProfiles (developer focus)

CohortCharacteristics (user focus)

Contents

addSex()

addAge()

addAge()

addAge()

addPriorObservation()

addPriorObservation()

addPriorObservation()

addPriorObservation()

addPriorObservation()

addPriorObservation()

addFutureObservation()

addFutureObservation()

addFutureObservation()

addFutureObservation()

addFutureObservation()

addInObservation()

addInObservation()

addInObservation()

addInObservation()

addInObservation()

addInObservation() window

addDateOfBirth()

addDemographics()

addDemographics()

add intersections overview

origin table

add intersections overview

target

add intersections overview

Estimate

12 functions

addCohortIntersectFlag

addTableIntersectCount

addCohortIntersectDate

addConceptIntersectDays

Analysis example

Summarise data

Summarise data

Summarise data

Summarise data

Summarise data

Overview

CohortCharacteristics

Contents

summariseCharacteristics

summariseCharacteristics

tableCharacteristics

tableCharacteristics

tableCharacteristics

tableDemographics

plotDemographics

tableCohortIntersect

plotCohortIntersect

summariseLargeScaleCharacterisation

tableLargeScaleCharacteristics

plotLargeScaleCharacteristics

plotLargeScaleCharacteristics

summariseCohortOverlap

summariseCohortOverlap

tableCohortOverlap

plotCohortOverlap

summariseCohortTiming

summariseCohortTiming

tableCohortTiming

plotCohortTiming

plotCohortTiming

Overview

Roadmap

Usage

Thanks