CodelistGenerator and PhenotypeR

Working with the OMOP CDM vocabulary tables & characterising our cohorts

Outline

CodelistGenerator

  1. Introduction to Vocabulary Tables
  2. Generate Codelists Using CodelistGenerator
  3. Codelist Diagnostics

PhenotypeR

  1. Introduction to PhenotypeR

1. Introduction to Vocabulary Tables

1.1 Vocabulary Tables

  • In OMOP, vocabularies are sets of clinical terminologies used to represent health-related concepts consistently across different databases.

  • ATHENA is the official OHDSI web-based tool used to access the OMOP vocabularies.

1.1 Vocabulary Tables

Example: Asthma

Concept: a single clinical idea, identified by a unique combination of concept_id, concept_name, domain, and standard concept status.

1.1 Vocabulary Tables

Example: Asthma

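Concept relationships: links between pairs of concepts, such as Maps to, Mapped from, Is a, and Subsumes. Ancestor and descendant links are stored in the concept ancestor table.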

1.2 CDM Vocabulary Tables

Where can we find all this information in our CDM?

Connect to a mock CDM:

# install.packages(c("CDMConnector", "dplyr", "tidyr", "DBI", "omopgenerics", "CodelistGenerator", "duckdb"))
library(CDMConnector)
library(dplyr)
library(tidyr)
library(DBI)
library(omopgenerics)
library(CodelistGenerator)

CDMConnector::requireEunomia("synpuf-1k", "5.3")

con <- DBI::dbConnect(duckdb::duckdb(), 
                      CDMConnector::eunomiaDir("synpuf-1k", "5.3"))
cdm <- CDMConnector::cdmFromCon(con = con, 
                                cdmName = "Eunomia Synpuf",
                                cdmSchema   = "main",
                                writeSchema = "main", 
                                achillesSchema = "main")
cdm
── # OMOP CDM reference (duckdb) of Eunomia Synpuf ───────────────────────────────────────────────────────────────────────────────
• omop tables: person, observation_period, visit_occurrence, visit_detail, condition_occurrence, drug_exposure,
procedure_occurrence, device_exposure, measurement, observation, death, note, note_nlp, specimen, fact_relationship, location,
care_site, provider, payer_plan_period, cost, drug_era, dose_era, condition_era, metadata, cdm_source, concept, vocabulary,
domain, concept_class, concept_relationship, relationship, concept_synonym, concept_ancestor, source_to_concept_map,
drug_strength, cohort_definition, attribute_definition
• cohort tables: -
• achilles tables: achilles_analysis, achilles_results, achilles_results_dist
• other tables: -

1.2 CDM Vocabulary Tables

Concept table:

cdm$concept |> glimpse()
Rows: ??
Columns: 10
Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
$ concept_id       <int> 43674447, 43674448, 43674449, 43674450, 43674451, 43674452, 43674453, 43674454, 43674455, 43674456, 436…
$ concept_name     <chr> "Benperidol 2 MG [Glianimon]", "Benperidol 2 MG Oral Tablet [Benperidol-Neuraxpharm] by Neuraxpharm", "…
$ domain_id        <chr> "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug", "Drug",…
$ vocabulary_id    <chr> "RxNorm Extension", "RxNorm Extension", "RxNorm Extension", "RxNorm Extension", "RxNorm Extension", "Rx…
$ concept_class_id <chr> "Branded Drug Comp", "Marketed Product", "Branded Drug", "Clinical Drug Box", "Clinical Drug Box", "Bra…
$ standard_concept <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", NA,…
$ concept_code     <chr> "OMOP929237", "OMOP929240", "OMOP929249", "OMOP929272", "OMOP929273", "OMOP929308", "OMOP929324", "OMOP…
$ valid_start_date <date> 2017-08-24, 2017-08-24, 2017-08-24, 2017-08-24, 2017-08-24, 2017-08-24, 2017-08-24, 2017-08-24, 2017-0…
$ valid_end_date   <date> 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-1…
$ invalid_reason   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, "U", "U", NA, NA, NA, N…

Example:

cdm$concept |>
  filter(concept_name == "Asthma",
         vocabulary_id == "SNOMED",
         standard_concept == "S")
# Source:   SQL [?? x 10]
# Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
  concept_id concept_name domain_id vocabulary_id concept_class_id standard_concept concept_code valid_start_date valid_end_date
       <int> <chr>        <chr>     <chr>         <chr>            <chr>            <chr>        <date>           <date>        
1     317009 Asthma       Condition SNOMED        Clinical Finding S                195967001    2002-01-31       2099-12-31    
# ℹ 1 more variable: invalid_reason <chr>

1.2 CDM Vocabulary Tables

Concept ancestor table:

cdm$concept_ancestor |> glimpse()
Rows: ??
Columns: 4
Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
$ ancestor_concept_id      <int> 262, 8478, 8479, 8480, 8482, 8483, 8484, 8487, 8488, 8494, 8495, 8496, 8504, 8505, 8506, 8507, …
$ descendant_concept_id    <int> 262, 8478, 8479, 8480, 8482, 8483, 8484, 8487, 8488, 8494, 8495, 8496, 8504, 8505, 8506, 8507, …
$ min_levels_of_separation <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,…
$ max_levels_of_separation <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1,…

Example:

cdm$concept_ancestor |>
  filter(ancestor_concept_id == 317009L) |>
  inner_join(cdm$concept |>
               select("descendant_concept_id" = "concept_id", 
                      "descendant_concept_name" = "concept_name"),
             by = "descendant_concept_id") |>
  select("descendant_concept_name", "min_levels_of_separation")
# Source:   SQL [?? x 2]
# Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
   descendant_concept_name                                                                          min_levels_of_separation
   <chr>                                                                                                               <int>
 1 Acute severe exacerbation of asthma co-occurrent with allergic rhinitis                                                 4
 2 Severe persistent allergic asthma                                                                                       2
 3 Acute severe exacerbation of severe persistent asthma co-occurrent with allergic rhinitis                               4
 4 Acute severe exacerbation of mild persistent allergic asthma co-occurrent with allergic rhinitis                        4
 5 Moderate persistent allergic asthma                                                                                     2
 6 Acute severe exacerbation of moderate persistent asthma co-occurrent with allergic rhinitis                             4
 7 Aspirin exacerbated respiratory disease                                                                                 4
 8 Chronic obstructive asthma co-occurrent with acute exacerbation of asthma                                               3
 9 Severe persistent asthma co-occurrent with allergic rhinitis                                                            3
10 Mild persistent asthma co-occurrent with allergic rhinitis                                                              3
# ℹ more rows
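
A minimal sketch of the same join on in-memory tables, assuming only that dplyr is installed. It highlights one detail visible in the glimpse above: the concept ancestor table holds a self-referencing row (separation 0) for each concept, so a descendant lookup also returns the concept itself. The toy tables are stand-ins for cdm$concept and cdm$concept_ancestor, and the separation value given for exercise-induced asthma is made up for illustration.

```r
library(dplyr)

# Toy stand-ins for cdm$concept and cdm$concept_ancestor.
# The two concept IDs appear elsewhere in these slides; the level of
# separation for 443801 is an illustrative assumption.
concept <- tibble(
  concept_id   = c(317009L, 443801L),
  concept_name = c("Asthma", "Exercise-induced asthma")
)
concept_ancestor <- tibble(
  ancestor_concept_id      = c(317009L, 317009L),
  descendant_concept_id    = c(317009L, 443801L),
  min_levels_of_separation = c(0L, 1L),
  max_levels_of_separation = c(0L, 1L)
)

# Same pattern as the slide: filter on the ancestor, then attach names
descendants <- concept_ancestor |>
  filter(ancestor_concept_id == 317009L) |>
  inner_join(
    concept |>
      select("descendant_concept_id" = "concept_id",
             "descendant_concept_name" = "concept_name"),
    by = "descendant_concept_id"
  ) |>
  select("descendant_concept_name", "min_levels_of_separation")

# Drop the self-referencing row to keep strict descendants only
strict_descendants <- descendants |>
  filter(min_levels_of_separation > 0)
```

Whether to keep the level-0 row depends on the use case: a codelist for asthma usually wants the parent concept included, whereas a hierarchy exploration may not.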

1.2 CDM Vocabulary Tables

Concept relationship table:

cdm$concept_relationship |> glimpse()
Rows: ??
Columns: 6
Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
$ concept_id_1     <int> 5, 6, 9, 11, 14, 18, 20, 22, 26, 28, 37, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 233, 234, 262, 262…
$ concept_id_2     <int> 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 58, 5, 6, 9, 11, 14, 18, 20, 22, 26, 28, 37, 44818955, 44818954…
$ relationship_id  <chr> "Concept replaced by", "Concept replaced by", "Concept replaced by", "Concept replaced by", "Concept re…
$ valid_start_date <date> 2015-10-16, 2015-10-16, 2015-10-16, 2015-10-16, 2015-10-16, 2015-10-16, 2015-10-16, 2015-10-16, 2015-1…
$ valid_end_date   <date> 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-12-31, 2099-1…
$ invalid_reason   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…

Example:

cdm$concept_relationship |>
  filter(concept_id_1 == 317009L) |>
  inner_join(cdm$concept |>
               select("concept_id_2" = "concept_id", 
                      "concept_name"),
             by = "concept_id_2") |>
  select("concept_name", "concept_id_2", "relationship_id")
# Source:   SQL [?? x 3]
# Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
   concept_name                                                  concept_id_2 relationship_id 
   <chr>                                                                <int> <chr>           
 1 Steroid dependent asthma                                          46273635 Subsumes        
 2 Asthma-chronic obstructive pulmonary disease overlap syndrome     46274062 Subsumes        
 3 Asthma without status asthmaticus                                  4206340 Subsumes        
 4 Airway structure                                                   4230487 Has finding site
 5 Asthmatic bronchitis                                               4233784 Subsumes        
 6 Asthma                                                             4264780 SNOMED - ind/CI 
 7 Bronchial Hyperreactivity                                          4267454 SNOMED - ind/CI 
 8 Asthma education                                                   4293727 Focus of        
 9 Asthma confirmed                                                   4293734 Asso finding of 
10 Substance induced asthma                                           4312524 Subsumes        
# ℹ more rows
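
The relationship that matters most when building codelists is "Maps to", which links a non-standard source concept to its standard equivalent. The sketch below shows that lookup on toy tables, assuming only dplyr. The pairing of source concept 44821988 with standard concept 317009 is taken from the code-use table later in these slides; the relationship rows themselves and the standard_concept flags are assumptions made for the example.

```r
library(dplyr)

# Toy stand-ins for cdm$concept and cdm$concept_relationship
concept <- tibble(
  concept_id       = c(317009L, 44821988L),
  concept_name     = c("Asthma", "Asthma, unspecified type, unspecified"),
  standard_concept = c("S", NA)  # assumed: the source concept is non-standard
)
concept_relationship <- tibble(
  concept_id_1    = c(44821988L, 317009L),
  concept_id_2    = c(317009L, 44821988L),
  relationship_id = c("Maps to", "Mapped from")  # assumed rows
)

# Follow "Maps to" from the source concept to the standard concept
mapping <- concept_relationship |>
  filter(relationship_id == "Maps to") |>
  inner_join(
    concept |> select("concept_id_1" = "concept_id",
                      "source_name" = "concept_name"),
    by = "concept_id_1"
  ) |>
  inner_join(
    concept |> select("concept_id_2" = "concept_id",
                      "standard_name" = "concept_name"),
    by = "concept_id_2"
  ) |>
  select("source_name", "standard_name", "relationship_id")
```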

1.2 CDM Vocabulary Tables

Concept synonym table:

cdm$concept_synonym |> glimpse()
Rows: ??
Columns: 3
Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
$ concept_id           <int> 33, 8689, 8691, 8715, 8715, 9173, 9174, 9176, 9176, 9177, 9181, 9188, 9189, 9189, 9189, 9190, 9190,…
$ concept_synonym_name <chr> "Provider specialty", "Flu due to unidentified influenza virus w oth resp manifest", "Maternal care…
$ language_concept_id  <int> 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, 4180186, …

Example:

cdm$concept_synonym |>
  filter(concept_id == 317009L)
# Source:   SQL [?? x 3]
# Database: DuckDB v1.1.0 [martaa@Windows 10 x64:R 4.4.1/C:\Users\martaa\AppData\Local\Temp\RtmpwFCqVw\file78788cf5cd8.duckdb]
  concept_id concept_synonym_name            language_concept_id
       <int> <chr>                                         <int>
1     317009 Asthma (disorder)                           4180186
2     317009 Bronchial hyperresponsiveness               4180186
3     317009 Airway hyperreactivity                      4180186
4     317009 Bronchial hyperreactivity                   4180186
5     317009 Bronchial asthma                            4180186
6     317009 Bronchial hypersensitivity                  4180186
7     317009 BHR - Bronchial hyperreactivity             4180186
8     317009 Hyperreactive airway disease                4180186
9     317009 Asthmatic                                   4180186

1.3 Exploring Vocabulary Tables with CodelistGenerator

Search results will be specific to the version of the vocabulary being used

[1] "v5.0 06-AUG-21"
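
The call that produced the version string above is not shown. CodelistGenerator provides getVocabVersion() for this purpose, so that is a likely candidate, but treat the attribution as an assumption. In the OMOP CDM the release of the vocabularies is recorded in the vocabulary table, in the row whose vocabulary_id is "None". A minimal sketch of that lookup on a toy table, assuming only dplyr:

```r
library(dplyr)

# Toy stand-in for cdm$vocabulary; the SNOMED row is a placeholder
vocabulary <- tibble(
  vocabulary_id      = c("None", "SNOMED"),
  vocabulary_version = c("v5.0 06-AUG-21", NA)
)

# The overall vocabulary release is stored against vocabulary_id "None"
vocab_version <- vocabulary |>
  filter(vocabulary_id == "None") |>
  pull("vocabulary_version")
```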

Which vocabularies are available?

getVocabularies(cdm = cdm)
 [1] "APC"                  "ATC"                  "BDPM"                 "CMS Place of Service" "Cohort"              
 [6] "Concept Class"        "Condition Type"       "CPT4"                 "Currency"             "Death Type"          
[11] "Device Type"          "Domain"               "DPD"                  "DRG"                  "Drug Type"           
[16] "Ethnicity"            "Gemscript"            "Gender"               "HCPCS"                "HES Specialty"       
[21] "ICD10"                "ICD10CM"              "ICD9CM"               "ICD9Proc"             "LOINC"               
[26] "MDC"                  "Meas Type"            "Multilex"             "Multum"               "NDC"                 
[31] "NDFRT"                "None"                 "Note Type"            "NUCC"                 "Obs Period Type"     
[36] "Observation Type"     "OPCS4"                "OXMIS"                "PCORNet"              "Procedure Type"      
[41] "Provider"             "Race"                 "Read"                 "Relationship"         "Revenue Code"        
[46] "RxNorm"               "RxNorm Extension"     "SMQ"                  "SNOMED"               "SPL"                 
[51] "Supplier"             "UCUM"                 "VA Class"             "VA Product"           "Visit"               
[56] "Visit Type"           "Vocabulary"          

1.3 Exploring Vocabulary Tables with CodelistGenerator

Which domains are present?

 [1] "Race"                "Gender"              "Device"              "Drug"                "Procedure"          
 [6] "Meas Value"          "Ethnicity"           "Provider"            "Metadata"            "Condition"          
[11] "Visit"               "Relationship"        "Spec Disease Status" "Observation"         "Unit"               
[16] "Measurement"         "Meas Value Operator" "Spec Anatomic Site"  "Revenue Code"        "Route"              
[21] "Specimen"            "Currency"           
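
The call behind the domain list is also not shown; CodelistGenerator has a getDomains() helper that returns this kind of result, though the arguments used here are unknown. Conceptually, it amounts to listing the distinct domain_id values among standard concepts. A rough sketch on a toy concept table, assuming only dplyr:

```r
library(dplyr)

# Toy stand-in for cdm$concept, using concept IDs seen in these slides;
# the NA flag for the last row is an assumption (non-standard concept)
concept <- tibble(
  concept_id       = c(317009L, 443801L, 43674447L, 44821988L),
  domain_id        = c("Condition", "Condition", "Drug", "Condition"),
  standard_concept = c("S", "S", "S", NA)
)

# Distinct domains among standard concepts only
domains <- concept |>
  filter(standard_concept == "S") |>
  distinct(domain_id) |>
  pull(domain_id)
```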

What concept classes are present?

getConceptClassId(cdm,
                  standardConcept = "Standard",
                  domain = "Drug")
 [1] "Branded Drug"        "Branded Drug Box"    "Branded Drug Comp"   "Branded Drug Form"   "Branded Pack"       
 [6] "Branded Pack Box"    "Clinical Drug"       "Clinical Drug Box"   "Clinical Drug Comp"  "Clinical Drug Form" 
[11] "Clinical Pack"       "Clinical Pack Box"   "CPT4"                "CPT4 Modifier"       "HCPCS"              
[16] "Ingredient"          "Marketed Product"    "Quant Branded Box"   "Quant Branded Drug"  "Quant Clinical Box" 
[21] "Quant Clinical Drug"

1.3 Exploring Vocabulary Tables with CodelistGenerator

What relationships do we have between standard concepts?

getRelationshipId(cdm, 
                  standardConcept1 = c("standard"),
                  standardConcept2 = c("standard"), 
                  domains1 = "condition", 
                  domains2 = "condition")
 [1] "Asso finding of"   "Asso with finding" "Due to of"         "Finding asso with" "Followed by"       "Follows"          
 [7] "Has asso finding"  "Has due to"        "Has manifestation" "Is a"              "Manifestation of"  "Mapped from"      
[13] "Maps to"           "Occurs after"      "Occurs before"     "Subsumes"         

What relationships do we have from standard to non-standard concepts?

getRelationshipId(cdm, 
                  standardConcept1 = c("standard"),
                  standardConcept2 = c("non-standard"), 
                  domains1 = "condition", 
                  domains2 = "condition")
 [1] "Asso finding of"      "Asso with finding"    "Concept alt_to from"  "Concept poss_eq from" "Concept replaces"    
 [6] "Concept same_as from" "Concept was_a from"   "Due to of"            "Finding asso with"    "Followed by"         
[11] "Follows"              "Has asso morph"       "Has due to"           "Has interprets"       "Has manifestation"   
[16] "Is a"                 "Manifestation of"     "Mapped from"          "Occurs after"         "Occurs before"       
[21] "Subsumes"             "Value mapped from"   

1.3 Exploring Vocabulary Tables with CodelistGenerator

What dose forms do we have in the database?

getDoseForm(cdm) |> head(n = 20)
 [1] "Augmented Topical Cream"               "Augmented Topical Gel"                 "Augmented Topical Lotion"             
 [4] "Augmented Topical Ointment"            "Auto-Injector"                         "Bar Soap"                             
 [7] "Bath Salts"                            "Buccal Film"                           "Buccal Tablet"                        
[10] "Cartridge"                             "Cement"                                "Chewable Bar"                         
[13] "Chewable Extended Release Oral Tablet" "Chewable Tablet"                       "Chewing Gum"                          
[16] "Collodion"                             "Delayed Release Oral Capsule"          "Delayed Release Oral Granules"        
[19] "Delayed Release Oral Tablet"           "Dental Pin"                           

What routes are available?

getRouteCategories(cdm) |> head(n = 20)
 [1] "implant"                           "inhalable"                         "injectable"                       
 [4] "oral"                              "topical"                           "transdermal"                      
 [7] "transmucosal"                      "transmucosal_buccal_or_sublingual" "transmucosal_nasal"               
[10] "transmucosal_rectal"               "transmucosal_vaginal"              "unclassified_route"               

What dose units are present?

getDoseUnit(cdm) |> head(n = 20)
 [1] "50% cell culture infectious dose"         "50% tissue culture infectious dose"      
 [3] "Actuation"                                "allergenic unit"                         
 [5] "bacteria"                                 "bioequivalent allergenic unit"           
 [7] "cells"                                    "clinical unit"                           
 [9] "colony forming unit"                      "focus forming unit"                      
[11] "homeopathic potency of centesimal series" "homeopathic potency of decimal series"   
[13] "index of reactivity"                      "international unit"                      
[15] "limit of flocculation unit"               "mega-international unit"                 
[17] "microgram"                                "microkatal"                              
[19] "millicurie"                               "milliequivalent"                         

1.4 ACHILLES Tables

  • The ACHILLES tables are additional tables that hold descriptive statistics of an OMOP CDM database.
  • They are typically generated by running ACHILLES after the ETL has been performed.
cdm
── # OMOP CDM reference (duckdb) of Eunomia Synpuf ───────────────────────────────────────────────────────────────────────────────
• omop tables: person, observation_period, visit_occurrence, visit_detail, condition_occurrence, drug_exposure,
procedure_occurrence, device_exposure, measurement, observation, death, note, note_nlp, specimen, fact_relationship, location,
care_site, provider, payer_plan_period, cost, drug_era, dose_era, condition_era, metadata, cdm_source, concept, vocabulary,
domain, concept_class, concept_relationship, relationship, concept_synonym, concept_ancestor, source_to_concept_map,
drug_strength, cohort_definition, attribute_definition
• cohort tables: -
• achilles tables: achilles_analysis, achilles_results, achilles_results_dist
• other tables: -

1.4 ACHILLES Tables

The interesting part of CodelistGenerator:

1.4 ACHILLES Tables

Counts for asthma:

2. Generate Codelists Using CodelistGenerator

2.1 Systematic Search of Codes Using CodelistGenerator

  • We are now going to create a codelist for asthma.
  • We will first identify a set of concepts that may be relevant.
asthma_codes <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  domains = "Condition", 
  includeDescendants = TRUE)

asthma_codes |> glimpse()
Rows: 187
Columns: 6
$ concept_id       <int> 46269770, 46269778, 46269789, 46270029, 46270082, 46273452, 46274124, 44810117, 4125022, 4141978, 41554…
$ found_from       <chr> "From initial search", "From initial search", "From initial search", "From initial search", "From initi…
$ concept_name     <chr> "Severe persistent allergic asthma", "Mild persistent asthma controlled", "Moderate persistent asthma u…
$ domain_id        <chr> "Condition", "Condition", "Condition", "Condition", "Condition", "Condition", "Condition", "Condition",…
$ vocabulary_id    <chr> "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SNOMED", "SN…
$ standard_concept <chr> "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S", "S"…
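
To make the found_from column concrete, here is a rough approximation of the search logic on toy tables, assuming only dplyr. It is a sketch of the idea (keyword match on concept names, then expansion through the concept ancestor table), not CodelistGenerator's actual implementation. The concept names and IDs appear in these slides, but the hierarchy link from Asthma to Allergic bronchopulmonary aspergillosis is assumed for the example.

```r
library(dplyr)

# Toy stand-in for cdm$concept
concept <- tibble(
  concept_id       = c(317009L, 4051466L, 257583L, 43674447L),
  concept_name     = c("Asthma", "Childhood asthma",
                       "Allergic bronchopulmonary aspergillosis",
                       "Benperidol 2 MG [Glianimon]"),
  domain_id        = c("Condition", "Condition", "Condition", "Drug"),
  standard_concept = "S"
)
# Toy stand-in for cdm$concept_ancestor; the third link is illustrative
concept_ancestor <- tibble(
  ancestor_concept_id   = c(317009L, 4051466L, 317009L),
  descendant_concept_id = c(317009L, 4051466L, 257583L)
)

# Step 1: keyword search within the requested domain
initial <- concept |>
  filter(domain_id == "Condition", standard_concept == "S",
         grepl("asthma", concept_name, ignore.case = TRUE)) |>
  mutate(found_from = "From initial search")

# Step 2: add descendants that the keyword search did not already find
descendants <- concept_ancestor |>
  filter(ancestor_concept_id %in% initial$concept_id) |>
  select("concept_id" = "descendant_concept_id") |>
  distinct() |>
  anti_join(initial, by = "concept_id") |>
  inner_join(concept, by = "concept_id") |>
  mutate(found_from = "From descendants")

candidate_codes <- bind_rows(initial, descendants)
```

This is why a concept whose name does not contain the keyword can still end up in the candidate list.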

2.1 Systematic Search of Codes Using CodelistGenerator

asthma_codes |> select("found_from") |> distinct()
# A tibble: 2 × 1
  found_from         
  <chr>              
1 From initial search
2 From descendants   
asthma_codes |> select("domain_id") |> distinct()
# A tibble: 1 × 1
  domain_id
  <chr>    
1 Condition
asthma_codes |> select("standard_concept") |> distinct()
# A tibble: 1 × 1
  standard_concept
  <chr>           
1 S               

2.1 Systematic Search of Codes Using CodelistGenerator

  • We are now going to create a codelist for asthma that excludes concepts whose name contains the string "child" (this matches child, childhood, children, childbirth, etc.).
asthma_codes_with_exclusion <- getCandidateCodes(
  cdm = cdm,
  keywords = "asthma",
  exclude  = "child",
  domains = "Condition", 
  includeDescendants = TRUE)
  • And compare both codelists:
compareCodelists(asthma_codes, asthma_codes_with_exclusion) |>
  filter(codelist == "Only codelist 1")
# A tibble: 2 × 5
  concept_id concept_name                             codelist_1 codelist_2 codelist       
       <int> <chr>                                         <dbl>      <dbl> <chr>          
1    4051466 Childhood asthma                                  1         NA Only codelist 1
2   45772073 Asthma in mother complicating childbirth          1         NA Only codelist 1
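
The second excluded concept shows that exclusion works on substrings: "childbirth" contains "child". A small sketch of that behaviour on a toy table, assuming only dplyr. It mimics the idea with a plain pattern match and an anti-join and is not how getCandidateCodes() or compareCodelists() are implemented.

```r
library(dplyr)

# Toy candidate list using three concepts from the output above
toy_codes <- tibble(
  concept_id   = c(317009L, 4051466L, 45772073L),
  concept_name = c("Asthma", "Childhood asthma",
                   "Asthma in mother complicating childbirth")
)

# Exclude any concept whose name contains "child" (case-insensitive)
toy_codes_with_exclusion <- toy_codes |>
  filter(!grepl("child", concept_name, ignore.case = TRUE))

# Concepts present only in the first list
only_in_first <- toy_codes |>
  anti_join(toy_codes_with_exclusion, by = "concept_id")
```

Reviewing the excluded concepts in this way is a useful check that an exclusion term has not removed more than intended.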

2.2 Generating Vocabulary Based Codelists for Medications

  • We are now going to create a codelist based on the ingredient acetaminophen.
acetaminophen <- getDrugIngredientCodes(cdm = cdm, 
                                        name = "acetaminophen", 
                                        nameStyle = "{concept_name}", 
                                        doseUnit = "milligram", 
                                        routeCategory = NULL,
                                        doseForm = NULL,
                                        ingredientRange = c(1,Inf))
acetaminophen

- acetaminophen (22256 codes)

2.2 Generating Vocabulary Based Codelists for Medications

  • Now we are going to stratify the codelist into different route categories:
acetaminophen_stratified <- stratifyByRouteCategory(acetaminophen, 
                                                    cdm = cdm, 
                                                    keepOriginal = TRUE)
acetaminophen_stratified 

- acetaminophen (22256 codes)
- acetaminophen_inhalable (3 codes)
- acetaminophen_injectable (689 codes)
- acetaminophen_oral (17219 codes)
- acetaminophen_topical (6 codes)
- acetaminophen_transmucosal_rectal (1459 codes)
along with 1 more codelists
  • Other stratification functions:
acetaminophen_by_concept   <- stratifyByConcept(acetaminophen, cdm)
acetaminophen_by_dose_unit <- stratifyByDoseUnit(acetaminophen, cdm)
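
Conceptually, stratifying a codelist splits its concept IDs into one named sub-list per category. The base R sketch below illustrates the idea and the naming pattern seen in the output above. The concept IDs and route assignments are placeholders, and this is not the package's implementation, which looks up routes in the vocabulary tables.

```r
# Placeholder concept IDs with a made-up route category for each
drug_routes <- data.frame(
  concept_id     = c(1L, 2L, 3L, 4L),
  route_category = c("oral", "oral", "injectable", "topical")
)

# A codelist is essentially a named list of concept ID vectors
codelist <- list(acetaminophen = drug_routes$concept_id)

# Split the IDs by route and prefix each name with the codelist name
by_route <- split(drug_routes$concept_id, drug_routes$route_category)
names(by_route) <- paste0("acetaminophen_", names(by_route))

# Keeping the original alongside the strata mirrors keepOriginal = TRUE
stratified <- c(codelist, by_route)
```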

2.2 Generating Vocabulary Based Codelists for Medications

  • Subset only to codes with a specific route:
acetaminophen_oral <- subsetOnRouteCategory(acetaminophen, 
                                            cdm = cdm, 
                                            routeCategory = "oral")
acetaminophen_oral |> stratifyByRouteCategory(cdm)

- acetaminophen_oral (17219 codes)
  • Subset only to codes without a specific route:
acetaminophen_no_oral <- subsetOnRouteCategory(acetaminophen, 
                                               cdm = cdm, 
                                               routeCategory = "oral",
                                               negate = TRUE)
acetaminophen_no_oral |> stratifyByRouteCategory(cdm)

- acetaminophen_inhalable (3 codes)
- acetaminophen_injectable (689 codes)
- acetaminophen_topical (6 codes)
- acetaminophen_transmucosal_rectal (1459 codes)
  • Other functions to subset a codelist:
subsetOnDomain(acetaminophen, cdm, domain = "Drug")
subsetOnDoseUnit(acetaminophen, cdm, doseUnit = "milligram")
subsetToCodesInUse(acetaminophen, cdm)

2.3 Generating a Codelist Based on ATC Codes

atc <- getATCCodes(cdm = cdm, level = "ATC 1st")
atc

- A_alimentary_tract_and_metabolism (211265 codes)
- B_blood_and_blood_forming_organs (83485 codes)
- C_cardiovascular_system (233725 codes)
- D_dermatologicals (118870 codes)
- G_genito_urinary_system_and_sex_hormones (64486 codes)
- H_systemic_hormonal_preparations_excl_sex_hormones_and_insulins (36165 codes)
along with 8 more codelists

2.4 Generating a Codelist Based on ICD10 Chapters

  • We can generate codelists based on ICD10 chapters and subchapters
  • Four different levels of granularity:
availableICD10(cdm, level = "ICD10 Chapter") |> 
  glimpse()
 chr [1:22] "Certain infectious and parasitic diseases" "Neoplasms" ...
availableICD10(cdm, level = "ICD10 SubChapter") |> 
  glimpse()
 chr [1:274] "Intestinal infectious diseases" "Tuberculosis" "Certain zoonotic bacterial diseases" "Other bacterial diseases" ...
availableICD10(cdm, level = "ICD10 Hierarchy") |> 
  glimpse()
 chr [1:2093] "Other salmonella infections" "Other bacterial intestinal infections" "Amoebiasis" "Brucellosis" "Other sepsis" ...
availableICD10(cdm, level = "ICD10 Code") |> 
  glimpse()
 chr [1:14130] "Enteropathogenic Escherichia coli infection" "Enteroinvasive Escherichia coli infection" ...

2.4 Generating a Codelist Based on ICD10 Chapters

icd_chapters <- getICD10StandardCodes(cdm = cdm,
                                      name = NULL, 
                                      level = c("ICD10 Chapter"),
                                      includeDescendants = TRUE)
icd_chapters 

- i_certain_infectious_and_parasitic_diseases (65191 codes)
- ii_neoplasms (16262 codes)
- iii_diseases_of_the_blood_and_blood_forming_organs_and_certain_disorders_involving_the_immune_mechanism (6604 codes)
- iv_endocrine_nutritional_and_metabolic_diseases (13483 codes)
- ix_diseases_of_the_circulatory_system (38407 codes)
- v_mental_and_behavioural_disorders (4602 codes)
along with 16 more codelists
mental_and_behavioural_disorders <- getICD10StandardCodes(cdm = cdm,
                                      name = "Mental and behavioural disorders", 
                                      level = "ICD10 Chapter",
                                      includeDescendants = TRUE)
mental_and_behavioural_disorders

- v_mental_and_behavioural_disorders (4602 codes)

2.5 SQL Server Practical

3. Codelist Diagnostics

3.1 Code Counts Using ACHILLES Tables

  • Let’s now characterise our asthma codelist obtained using getCandidateCodes()
  • First, let’s turn our data table into a codelist object:
class(asthma_codes)
[1] "tbl_df"     "tbl"        "data.frame"
asthma <- newCodelist(list("asthma" = asthma_codes$concept_id))

class(asthma)
[1] "codelist" "list"    
  • Then, let’s summarise record and person counts using the ACHILLES tables:
asthma_achilles <- summariseAchillesCodeUse(asthma,
                                            cdm = cdm,
                                            countBy = c("record", "person"))

3.1 Code Counts Using ACHILLES Tables

tableAchillesCodeUse(asthma_achilles)
Database name: Eunomia Synpuf

Codelist name | Domain ID | Standard concept name | Standard concept ID | Standard concept | Vocabulary ID | Record count | Person count
asthma | condition | Intrinsic asthma without status asthmaticus | 252658 | standard | SNOMED | 39 | 25
asthma | condition | Asthmatic pulmonary eosinophilia | 252942 | standard | SNOMED | 72 | 59
asthma | condition | Chronic asthmatic bronchitis | 256448 | standard | SNOMED | 74 | 63
asthma | condition | Exacerbation of asthma | 257581 | standard | SNOMED | 49 | 40
asthma | condition | Allergic bronchopulmonary aspergillosis | 257583 | standard | SNOMED | 5 | 5
asthma | condition | IgE-mediated allergic asthma | 312950 | standard | SNOMED | 67 | 45
asthma | condition | Cough variant asthma | 313236 | standard | SNOMED | 11 | 10
asthma | condition | Asthma | 317009 | standard | SNOMED | 274 | 183
asthma | condition | Acute exacerbation of chronic asthmatic bronchitis | 40481763 | standard | SNOMED | 33 | 29
asthma | condition | Exercise-induced asthma | 443801 | standard | SNOMED | 8 | 8
asthma | condition | Acute severe exacerbation of asthma | 45769438 | standard | SNOMED | 11 | 10
asthma | condition | Acute exacerbation of allergic asthma | 45769441 | standard | SNOMED | 7 | 7
asthma | condition | Acute severe exacerbation of immunoglobin E-mediated allergic asthma | 45769442 | standard | SNOMED | 12 | 12
asthma | condition | Acute severe exacerbation of intrinsic asthma | 45769443 | standard | SNOMED | 7 | 7
asthma | condition | Acute exacerbation of intrinsic asthma | 45773005 | standard | SNOMED | 18 | 13
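
These counts are read from precomputed ACHILLES results, not from patient-level records. In ACHILLES, analysis 400 holds the number of persons with at least one condition occurrence per concept, and analysis 401 holds the number of condition occurrence records per concept, with the concept ID stored as text in stratum_1. A minimal sketch of that lookup on a toy table, assuming only dplyr; the two counts are the Asthma (317009) figures from the table above.

```r
library(dplyr)

# Toy stand-in for cdm$achilles_results (only the columns needed here)
achilles_results <- tibble(
  analysis_id = c(400L, 401L),
  stratum_1   = c("317009", "317009"),  # concept ID stored as character
  count_value = c(183L, 274L)
)

# Label each analysis with the estimate it represents
asthma_counts <- achilles_results |>
  filter(analysis_id %in% c(400L, 401L), stratum_1 == "317009") |>
  mutate(estimate_name = if_else(analysis_id == 401L,
                                 "record_count", "person_count")) |>
  select("stratum_1", "estimate_name", "count_value")
```

Because the results are precomputed, this route is fast but cannot be stratified by sex, age, or year in the way patient-level counts can.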

3.2 Code Counts Using Patient-Level Data

  • Let’s now count records and persons directly from the patient-level data in the database.
asthma_code_use <- summariseCodeUse(asthma,
                                    cdm = cdm,
                                    countBy = c("record", "person"),
                                    byYear = FALSE,
                                    bySex = TRUE,
                                    ageGroup = NULL,
                                    dateRange = as.Date(c(NA,NA)))
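
The difference between the two countBy options is easiest to see on a small example: a record count is the number of rows, while a person count is the number of distinct individuals. The sketch below assumes only dplyr and uses made-up person IDs. The sex column is a simplification, since the real person table stores gender_concept_id instead.

```r
library(dplyr)

# Toy stand-ins for cdm$condition_occurrence and cdm$person
condition_occurrence <- tibble(
  person_id            = c(1L, 1L, 2L, 3L),
  condition_concept_id = c(317009L, 317009L, 317009L, 443801L)
)
person <- tibble(
  person_id = 1:3,
  sex       = c("Female", "Female", "Male")
)

asthma_ids <- c(317009L, 443801L)

# Records = rows; persons = distinct person_id values
code_use <- condition_occurrence |>
  filter(condition_concept_id %in% asthma_ids) |>
  inner_join(person, by = "person_id") |>
  group_by(sex, condition_concept_id) |>
  summarise(record_count = n(),
            person_count = n_distinct(person_id),
            .groups = "drop")
```

Here person 1 contributes two records but only one person, so the Female row for concept 317009 has three records and two persons.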

3.2 Code Counts Using Patient-Level Data

tableCodeUse(asthma_code_use, groupColumn = "sex")
Database name: Eunomia Synpuf

Codelist name | Standard concept name | Standard concept ID | Source concept name | Source concept ID | Source concept value | Domain ID | Record count | Person count

Sex: overall
asthma | overall | - | NA | NA | NA | NA | 687 | 301
asthma | Acute severe exacerbation of asthma | 45769438 | Asthma, unspecified type, with status asthmaticus | 44832424 | 49391 | condition | 11 | 10
asthma | Acute severe exacerbation of immunoglobin E-mediated allergic asthma | 45769442 | Extrinsic asthma with status asthmaticus | 44824288 | 49301 | condition | 12 | 12
asthma | Intrinsic asthma without status asthmaticus | 252658 | Intrinsic asthma, unspecified | 44824289 | 49310 | condition | 39 | 25
asthma | Acute exacerbation of chronic asthmatic bronchitis | 40481763 | Chronic obstructive asthma with (acute) exacerbation | 44837136 | 49322 | condition | 33 | 29
asthma | Acute severe exacerbation of intrinsic asthma | 45769443 | Intrinsic asthma with status asthmaticus | 44834769 | 49311 | condition | 7 | 7
asthma | Acute exacerbation of intrinsic asthma | 45773005 | Intrinsic asthma with (acute) exacerbation | 44837135 | 49312 | condition | 18 | 13
asthma | Allergic bronchopulmonary aspergillosis | 257583 | Allergic bronchopulmonary aspergillosis | 44831284 | 5186 | condition | 5 | 5
asthma | Cough variant asthma | 313236 | Cough variant asthma | 44833611 | 49382 | condition | 11 | 10
asthma | Asthmatic pulmonary eosinophilia | 252942 | Pulmonary eosinophilia | 44827830 | 5183 | condition | 72 | 59
asthma | IgE-mediated allergic asthma | 312950 | Extrinsic asthma, unspecified | 44824287 | 49300 | condition | 67 | 45
asthma | Chronic asthmatic bronchitis | 256448 | Chronic obstructive asthma with status asthmaticus | 44820889 | 49321 | condition | 12 | 12
asthma | Acute exacerbation of allergic asthma | 45769441 | Extrinsic asthma with (acute) exacerbation | 44831278 | 49302 | condition | 7 | 7
asthma | Exacerbation of asthma | 257581 | Asthma, unspecified type, with (acute) exacerbation | 44823144 | 49392 | condition | 49 | 40
asthma | Exercise-induced asthma | 443801 | Exercise induced bronchospasm | 44826678 | 49381 | condition | 8 | 8
asthma | Asthma | 317009 | Asthma, unspecified type, unspecified | 44821988 | 49390 | condition | 274 | 183
asthma | Chronic asthmatic bronchitis | 256448 | Chronic obstructive asthma, unspecified | 44831280 | 49320 | condition | 62 | 52

Sex: Female
asthma | overall | - | NA | NA | NA | NA | 402 | 164
asthma | Acute severe exacerbation of intrinsic asthma | 45769443 | Intrinsic asthma with status asthmaticus | 44834769 | 49311 | condition | 4 | 4
asthma | Asthma | 317009 | Asthma, unspecified type, unspecified | 44821988 | 49390 | condition | 156 | 98
asthma | Acute exacerbation of allergic asthma | 45769441 | Extrinsic asthma with (acute) exacerbation | 44831278 | 49302 | condition | 4 | 4
asthma | Acute exacerbation of chronic asthmatic bronchitis | 40481763 | Chronic obstructive asthma with (acute) exacerbation | 44837136 | 49322 | condition | 23 | 19
asthma | Intrinsic asthma without status asthmaticus | 252658 | Intrinsic asthma, unspecified | 44824289 | 49310 | condition | 15 | 14
asthma | Asthmatic pulmonary eosinophilia | 252942 | Pulmonary eosinophilia | 44827830 | 5183 | condition | 38 | 37
asthma | IgE-mediated allergic asthma | 312950 | Extrinsic asthma, unspecified | 44824287 | 49300 | condition | 48 | 28
asthma | Acute exacerbation of intrinsic asthma | 45773005 | Intrinsic asthma with (acute) exacerbation | 44837135 | 49312 | condition | 12 | 7
asthma | Chronic asthmatic bronchitis | 256448 | Chronic obstructive asthma with status asthmaticus | 44820889 | 49321 | condition | 8 | 8
asthma | Exercise-induced asthma | 443801 | Exercise induced bronchospasm | 44826678 | 49381 | condition | 6 | 6
asthma | Acute severe exacerbation of immunoglobin E-mediated allergic asthma | 45769442 | Extrinsic asthma with status asthmaticus | 44824288 | 49301 | condition | 4 | 4
asthma | Exacerbation of asthma | 257581 | Asthma, unspecified type, with (acute) exacerbation | 44823144 | 49392 | condition | 30 | 23
asthma | Acute severe exacerbation of asthma | 45769438 | Asthma, unspecified type, with status asthmaticus | 44832424 | 49391 | condition | 8 | 7
asthma | Cough variant asthma | 313236 | Cough variant asthma | 44833611 | 49382 | condition | 9 | 8
asthma | Chronic asthmatic bronchitis | 256448 | Chronic obstructive asthma, unspecified | 44831280 | 49320 | condition | 37 | 29

Sex: Male
asthma | overall | - | NA | NA | NA | NA | 285 | 137
asthma | Cough variant asthma | 313236 | Cough variant asthma | 44833611 | 49382 | condition | 2 | 2
asthma | Intrinsic asthma without status asthmaticus | 252658 | Intrinsic asthma, unspecified | 44824289 | 49310 | condition | 24 | 11
asthma | Acute severe exacerbation of immunoglobin E-mediated allergic asthma | 45769442 | Extrinsic asthma with status asthmaticus | 44824288 | 49301 | condition | 8 | 8
asthma | Chronic asthmatic bronchitis | 256448 | Chronic obstructive asthma with status asthmaticus | 44820889 | 49321 | condition | 4 | 4
asthma | Acute severe exacerbation of intrinsic asthma | 45769443 | Intrinsic asthma with status asthmaticus | 44834769 | 49311 | condition | 3 | 3
asthma | Acute severe exacerbation of asthma | 45769438 | Asthma, unspecified type, with status asthmaticus | 44832424 | 49391 | condition | 3 | 3
asthma | Asthmatic pulmonary eosinophilia | 252942 | Pulmonary eosinophilia | 44827830 | 5183 | condition | 34 | 22
Asthma 317009 Asthma, unspecified type, unspecified 44821988 49390 condition 118 85
Exercise-induced asthma 443801 Exercise induced bronchospasm 44826678 49381 condition 2 2
IgE-mediated allergic asthma 312950 Extrinsic asthma, unspecified 44824287 49300 condition 19 17
Acute exacerbation of allergic asthma 45769441 Extrinsic asthma with (acute) exacerbation 44831278 49302 condition 3 3
Acute exacerbation of chronic asthmatic bronchitis 40481763 Chronic obstructive asthma with (acute) exacerbation 44837136 49322 condition 10 10
Allergic bronchopulmonary aspergillosis 257583 Allergic bronchopulmonary aspergillosis 44831284 5186 condition 5 5
Chronic asthmatic bronchitis 256448 Chronic obstructive asthma, unspecified 44831280 49320 condition 25 23
Acute exacerbation of intrinsic asthma 45773005 Intrinsic asthma with (acute) exacerbation 44837135 49312 condition 6 6
Exacerbation of asthma 257581 Asthma, unspecified type, with (acute) exacerbation 44823144 49392 condition 19 17

3.3 Identify Orphan Codes

  • Orphan codes are concepts that may be related to our codelist but have not been included in it.
  • summariseOrphanCodes() looks for descendants and ancestors of the codes in the codelist (both via the concept_ancestor table), and for other related concepts (via the concept_relationship table).
  • Record and person counts are taken from the ACHILLES tables, so these must be available in the cdm reference.
orphan <- summariseOrphanCodes(asthma, 
                               cdm)
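
To make the idea concrete, here is a minimal sketch of the ancestor/descendant part of the search on toy data. It is not the package's implementation: the concept IDs, the toy `concept_ancestor` rows, and the `achilles_counts` table are all made up for illustration, and the final join assumes the counts are used to keep only concepts that actually appear in the database.

```r
library(dplyr)

# Toy stand-in for the OMOP concept_ancestor table (IDs are made up).
# Hierarchy: 1 -> 2 -> {3, 4}, plus an unrelated pair 5 -> 6.
concept_ancestor <- tibble(
  ancestor_concept_id   = c(1L, 1L, 1L, 2L, 2L, 2L, 5L),
  descendant_concept_id = c(2L, 3L, 4L, 2L, 3L, 4L, 6L)
)

# Toy stand-in for ACHILLES-style record counts (assumption: only
# concepts with records in the database are listed here).
achilles_counts <- tibble(
  concept_id   = c(1L, 3L),
  record_count = c(10L, 4L)
)

# The codelist under review contains a single concept.
codelist <- c(2L)

# Descendants of the codelist concepts.
descendants <- concept_ancestor |>
  filter(ancestor_concept_id %in% codelist) |>
  pull(descendant_concept_id)

# Ancestors of the codelist concepts.
ancestors <- concept_ancestor |>
  filter(descendant_concept_id %in% codelist) |>
  pull(ancestor_concept_id)

# Related concepts that are not already in the codelist.
candidates <- sort(setdiff(union(descendants, ancestors), codelist))

# Keep only candidates with records in the database, most frequent first.
orphans <- tibble(concept_id = candidates) |>
  inner_join(achilles_counts, by = "concept_id") |>
  arrange(desc(record_count))

orphans
```

Concepts 5 and 6 never appear because they are not connected to the codelist, and concept 4 is dropped because it has no records in the toy counts.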

3.3 Identify Orphan Codes

Database name: Eunomia Synpuf

Codelist name | Domain ID | Standard concept name | Standard concept ID | Standard concept | Vocabulary ID | Record count | Person count
asthma | condition | Chronic obstructive lung disease | 255573 | standard | SNOMED | 1,222 | 457
asthma | condition | Chronic bronchitis | 255841 | standard | SNOMED | 97 | 79
asthma | condition | Bronchitis | 256451 | standard | SNOMED | 146 | 104
asthma | condition | Acute exacerbation of chronic obstructive airways disease | 257004 | standard | SNOMED | 327 | 155
asthma | condition | Allergic rhinitis | 257007 | standard | SNOMED | 370 | 232
asthma | condition | Wheezing | 314754 | standard | SNOMED | 81 | 64
asthma | condition | Disorder of respiratory system | 320136 | standard | SNOMED | 79 | 75
asthma | condition | Respiratory finding | 4024567 | standard | SNOMED | 5 | 5
asthma | condition | At risk - finding | 4085075 | standard | SNOMED | 26 | 25
asthma | condition | Disorder of the nose | 4229909 | standard | SNOMED | 40 | 36
asthma | condition | Disorder characterized by eosinophilia | 4302954 | standard | SNOMED | 8 | 7
asthma | condition | Disorder due to infection | 432250 | standard | SNOMED | 1 | 1
asthma | condition | Poisoning by drug AND/OR medicinal substance | 438028 | standard | SNOMED | 17 | 13
asthma | condition | Left heart failure | 439846 | standard | SNOMED | 109 | 72
asthma | condition | Disorder of immune function | 440371 | standard | SNOMED | 1 | 1
asthma | condition | Poisoning | 442562 | standard | SNOMED | 1 | 1
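
Orphan codes are candidates for review, not automatic additions. The sketch below shows one possible way to work through them: rank the candidates by record count, then add any concept judged relevant to the codelist. The counts are copied from the first rows of the table above; `asthma_toy` and `reviewed_ids` are hypothetical names, and the choice of concept to add is purely illustrative, not a clinical recommendation.

```r
library(dplyr)

# Counts copied from the first six rows of the orphan codes table.
orphan_counts <- tibble(
  concept_id = c(255573L, 255841L, 256451L, 257004L, 257007L, 314754L),
  concept_name = c(
    "Chronic obstructive lung disease",
    "Chronic bronchitis",
    "Bronchitis",
    "Acute exacerbation of chronic obstructive airways disease",
    "Allergic rhinitis",
    "Wheezing"
  ),
  record_count = c(1222L, 97L, 146L, 327L, 370L, 81L),
  person_count = c(457L, 79L, 104L, 155L, 232L, 64L)
)

# Review the most frequently recorded candidates first.
ranked <- orphan_counts |>
  arrange(desc(record_count))

# A codelist is a named list of concept IDs; this toy one holds two
# asthma concepts that appear earlier in the code use table.
asthma_toy <- list(asthma = c(317009L, 257581L))

# Hypothetical outcome of the review (illustration only).
reviewed_ids <- c(314754L)

# union() avoids duplicating IDs that are already in the codelist.
asthma_toy$asthma <- union(asthma_toy$asthma, reviewed_ids)

ranked
asthma_toy
```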

PhenotypeR

CodelistGenerator