
Introduction

  • Goal: Showcase the core functionalities of the DeepRHealth::medcode module.

  • Supports major medical coding systems (e.g., ICD, ATC) natively in R.

  • Inspired by the Python-based PyHealth.medcode.

  • Key Features to Demo:
    • Code Lookup

    • Hierarchy Navigation (Ancestors/Descendants)

    • Cross-System Mapping

    • ATC Specific Utilities

Environment Setup

Loading Package & Dependencies

  • We’ll load the package directly from source using devtools::load_all().
  • This requires running the presentation from the package’s root directory.
  • We also load other necessary packages like knitr.
# Load required libraries for loading and display
library(devtools)
library(knitr)
library(rappdirs) # Used by DeepRHealth internally for caching
library(fs) # Used by DeepRHealth internally for caching

# Load all functions, data, etc. from the DeepRHealth package source
# IMPORTANT: Ensure R's working directory is the package root!
load_all(".")

# Optional: Display where medcode data would be cached
# cache_dir <- user_cache_dir("DeepRHealth", "medcode")
# print(paste("Data cache directory:", cache_dir))

# NOTE: Assume necessary data (e.g., ICD9CM.csv) is pre-cached for demo speed.

Data Handling

Download & Caching

The package automatically downloads required medical code datasets (CSVs) and caches them locally for faster and offline use.

  • Uses rappdirs to find a user-specific cache directory.
  • The download_medcode() function checks the cache first; only downloads if the file is missing.

Let’s see where the data is (or will be) cached:

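# Get the platform-specific cache directory path.
# Note: the second argument of user_cache_dir() is appauthor, which is only
# used on Windows, so "medcode" does not appear in the Linux path below.
cache_dir <- rappdirs::user_cache_dir("DeepRHealth", "medcode")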
print(paste("Data cache directory:", cache_dir))
## [1] "Data cache directory: ~/.cache/DeepRHealth"

Demonstrating download_medcode():

(This step ensures the data file exists locally, downloading only if necessary)

# Specify the dataset name (e.g., "ICD9CM")
dataset_name <- "ICD9CM"

# Call the function - returns path. Downloads ONLY if not cached.
# NOTE: For a live demo, ensure network access OR pre-cache the file!
#       We assume it's pre-cached here.
file_path <- download_medcode(name = dataset_name)

print(paste("Path for", dataset_name, ":", file_path))
## [1] "Path for ICD9CM : ~/.cache/RHealth/medcode/ICD9CM.csv"
print(paste("File exists:", fs::file_exists(file_path)))
## [1] "File exists: TRUE"
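
The cache-first behaviour described above can be sketched in base R. This is a minimal illustration, not the package's implementation: fetch_cached() and fake_download() are hypothetical names, and the "download" is a stand-in that writes a tiny CSV so the example runs offline.

```r
# Hypothetical helper: return the cached file path, fetching only on a miss.
fetch_cached <- function(name, cache_dir, downloader) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(name, ".csv"))
  if (!file.exists(path)) {
    downloader(path)  # only reached when the file is not cached yet
  }
  path
}

# Stand-in for a real download: writes a one-row CSV and counts its calls.
calls <- 0
fake_download <- function(dest) {
  calls <<- calls + 1
  writeLines(c("code,parent_code,name", "428,420-429.99,Heart failure"), dest)
}

cache_dir <- tempfile("medcode-cache-")  # fresh, empty location for the demo
p1 <- fetch_cached("ICD9CM", cache_dir, fake_download)  # cache miss: fetches
p2 <- fetch_cached("ICD9CM", cache_dir, fake_download)  # cache hit: reuses file
calls  # the stand-in downloader ran once
```

Passing the downloader as an argument keeps the caching logic testable without network access.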

Loading Data (load_medcode)

The load_medcode() function is used internally by most other functions (lookup_code, get_ancestors, etc.).

  1. It first calls download_medcode() to ensure the data file is available locally.

  2. Then, it reads the CSV file into an R data frame (tibble) using readr::read_csv.

Example: Loading the ICD9CM data

# Load the data using the function
# This implicitly calls download_medcode first
icd9_data_example <- load_medcode("ICD9CM")

# Confirm data is loaded by checking dimensions and showing first few rows
print(paste("Loaded ICD9CM - Dimensions:", paste(dim(icd9_data_example), collapse = " x ")))
## [1] "Loaded ICD9CM - Dimensions: 17736 x 3"
print("First few rows:")
## [1] "First few rows:"
kable(head(icd9_data_example))
code    parent_code  name
806.11  806.1        Open fracture of C1-C4 level with complete lesion of cord
642.41  642.4        Mild or unspecified pre-eclampsia, delivered, with or without mention of antepartum condition
647.13  647.1        Gonorrhea of mother, complicating pregnancy, childbirth, or the puerperium, antepartum condition or complication
374.21  374.2        Paralytic lagophthalmos
679.00  679.0        Maternal complications from in utero procedure, unspecified as to episode of care or not applicable
013.41  013.4        Tuberculoma of spinal cord, bacteriological or histological examination not done
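
Codes such as 013.41 and 679.00 only survive loading if they are read as text rather than numbers. The sketch below shows the difference with base R's read.csv(); how load_medcode() configures its own reader is not stated here, so treat this as an illustration of the pitfall only. The names in the toy file are shortened.

```r
# Write a two-row toy CSV (names shortened for the example).
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "code,parent_code,name",
  "013.41,013.4,Tuberculoma of spinal cord",
  "679.00,679.0,Maternal complications from in utero procedure"
), csv_path)

# Force every column to character so codes are kept verbatim.
as_text <- read.csv(csv_path, colClasses = "character")

# Default type guessing turns the code column into numbers.
as_guess <- read.csv(csv_path)

as_text$code           # character: leading and trailing zeros are kept
is.numeric(as_guess$code)  # TRUE: the zeros are lost in the numeric form
```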

Feature 1: Code Lookup (lookup_code)

This function returns the row for a specific medical code within a given coding system: the code itself, its parent code, and its name (description).

Example: Look up ICD-9-CM code “428.0”

# Input: code string and system name
code_info <- lookup_code(code = "428.0", system = "ICD9CM")

# Output: A tibble/data frame row with information
# Using kable() for potentially nicer table output in the presentation
kable(code_info)
code   parent_code  name
428.0  428          Congestive heart failure, unspecified
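
Conceptually, a lookup like this is a filter on the code column. A minimal sketch, assuming a table with the same three columns as above; lookup_sketch() is a hypothetical name and the two-row table is toy data, not the real dataset:

```r
# Toy table with the same columns as the ICD9CM data shown earlier.
codes <- data.frame(
  code = c("428", "428.0"),
  parent_code = c("420-429.99", "428"),
  name = c("Heart failure", "Congestive heart failure, unspecified"),
  stringsAsFactors = FALSE
)

# Hypothetical lookup: return the matching row, or fail loudly if none exists.
lookup_sketch <- function(code, table) {
  hit <- table[table$code == code, , drop = FALSE]
  if (nrow(hit) == 0) stop("Unknown code: ", code)
  hit
}

info <- lookup_sketch("428.0", codes)
info$name
```

Raising an error for unknown codes is a design choice of this sketch; returning an empty data frame would be an equally reasonable alternative.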

Feature 2: Hierarchy - Ancestors (get_ancestors)

This function navigates the code hierarchy upwards to find all parent and ancestor codes.

Example: Find ancestors for ICD-9-CM code “428.22” (Systolic heart failure, acute on chronic)

# Input: code string and system name
ancestors <- get_ancestors(code = "428.22", system = "ICD9CM")

# Output: A named character vector of ancestor codes, nearest first;
# each name is a code and each value is that code's parent
print(ancestors)
##       428.22        428.2          428   420-429.99   390-459.99 
##      "428.2"        "428" "420-429.99" "390-459.99" "001-999.99"
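
The upward walk can be sketched as repeatedly following the parent_code link until no parent is left. The parent chain below is taken from the output above; ancestors_sketch() is a hypothetical name and this is not necessarily how get_ancestors() is implemented.

```r
# Child -> parent links, copied from the ancestor output shown above.
parent_of <- c(
  "428.22"     = "428.2",
  "428.2"      = "428",
  "428"        = "420-429.99",
  "420-429.99" = "390-459.99",
  "390-459.99" = "001-999.99"
)

# Hypothetical helper: follow parent links until the root is reached.
ancestors_sketch <- function(code, parent_of) {
  out <- character(0)
  current <- code
  while (current %in% names(parent_of)) {
    current <- parent_of[[current]]
    if (current %in% out) break  # guard against accidental cycles
    out <- c(out, current)
  }
  out
}

anc <- ancestors_sketch("428.22", parent_of)
anc
# "428.2" "428" "420-429.99" "390-459.99" "001-999.99"
```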

Feature 3: Hierarchy - Descendants (get_descendants)

This function navigates the code hierarchy downwards to find all child and descendant codes.

Example: Find descendants for ICD-9-CM code “428” (Heart failure)

# Input: code string and system name
descendants <- get_descendants(code = "428", system = "ICD9CM")

# Output: A character vector of descendant codes (can be long!)
print(head(descendants)) # Show only the first few for brevity
## [1] "428.4" "428.1" "428.0" "428.3" "428.2" "428.9"
print(paste("Total descendants found:", length(descendants)))
## [1] "Total descendants found: 18"
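
The downward direction is a breadth-first traversal: collect the children of the start code, then their children, and so on. A minimal sketch on a small subset of the 428 subtree; descendants_sketch() is a hypothetical name and the edge table is toy data, so it finds fewer codes than the 18 reported above.

```r
# Small subset of the 428 subtree as (code, parent_code) pairs.
edges <- data.frame(
  code        = c("428.0", "428.2", "428.20", "428.22", "428.9"),
  parent_code = c("428",   "428",   "428.2",  "428.2",  "428"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: breadth-first search over parent -> child links.
descendants_sketch <- function(code, edges) {
  found <- character(0)
  frontier <- code
  while (length(frontier) > 0) {
    children <- edges$code[edges$parent_code %in% frontier]
    children <- setdiff(children, found)  # skip codes already visited
    found <- c(found, children)
    frontier <- children
  }
  found
}

desc <- descendants_sketch("428", edges)
desc
# "428.0" "428.2" "428.9" "428.20" "428.22"
```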

Feature 4: Cross-System Mapping

Feature 4a: Supported Mappings (supported_cross)

Let’s see which mappings (crosswalks) are currently supported. The supported_cross() function returns the identifiers of all code system mappings available in the package.

# Call the function to get the list of supported crosswalks
available_crosswalks <- supported_cross()

# Print the available mapping identifiers
print(available_crosswalks)
## [1] "ICD9CM_to_CCSCM"      "ICD9PROC_to_CCSPROC"  "ICD10CM_to_CCSCM"    
## [4] "ICD10PROC_to_CCSPROC" "NDC_to_ATC"
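
Because each identifier follows a FROM_to_TO pattern, it can be split into its source and target systems, for example to check whether a mapping exists before requesting it. The sketch below hard-codes the identifiers printed above; has_mapping() is a hypothetical helper, not part of the package.

```r
# Identifiers copied from the supported_cross() output above.
ids <- c("ICD9CM_to_CCSCM", "ICD9PROC_to_CCSPROC", "ICD10CM_to_CCSCM",
         "ICD10PROC_to_CCSPROC", "NDC_to_ATC")

# Split each identifier on the literal separator "_to_".
parts <- strsplit(ids, "_to_", fixed = TRUE)
pairs <- data.frame(
  from = vapply(parts, `[`, character(1), 1),
  to   = vapply(parts, `[`, character(1), 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: is there a crosswalk in this direction?
has_mapping <- function(from, to, pairs) {
  any(pairs$from == from & pairs$to == to)
}

has_mapping("ICD9CM", "CCSCM", pairs)  # TRUE
has_mapping("ATC", "NDC", pairs)       # FALSE: only NDC_to_ATC is listed
```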

Feature 4b: Translate Codes (map_code)

This function translates codes from one coding system to another using pre-defined mapping tables.

Example: Map ICD-9-CM “428.0” to the CCSCM system (Clinical Classifications Software)

# Input: code, source system (from), target system (to)
mapped_code <- map_code(code = "428.0", from = "ICD9CM", to = "CCSCM")

# Output: The corresponding code(s) in the target system
print(mapped_code)
## [1] "108"
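
A crosswalk of this kind can be thought of as a two-column table that is filtered on the source code. In the sketch below only the 428.0 to 108 row comes from the output above; the "X1" rows are made up to show that one source code may map to several targets. map_sketch() is a hypothetical name.

```r
# Toy crosswalk: first row from the output above, "X1" rows are invented.
crosswalk <- data.frame(
  ICD9CM = c("428.0", "X1",  "X1"),
  CCSCM  = c("108",   "900", "901"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return all target codes for one source code.
map_sketch <- function(code, crosswalk, from, to) {
  unique(crosswalk[[to]][crosswalk[[from]] == code])
}

one  <- map_sketch("428.0", crosswalk, from = "ICD9CM", to = "CCSCM")
many <- map_sketch("X1", crosswalk, from = "ICD9CM", to = "CCSCM")
none <- map_sketch("000", crosswalk, from = "ICD9CM", to = "CCSCM")
```

Returning a character vector (possibly empty, possibly longer than one) lets the caller decide how to handle unmapped or ambiguous codes.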

Feature 5: ATC Specific Functions

Now let’s look at utilities specifically designed for the ATC (Anatomical Therapeutic Chemical) classification system for drugs.

Feature 5a: ATC Level Conversion (atc_convert)

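This utility function truncates an ATC code to the prefix for a given classification level (L1 to L5). In the WHO ATC hierarchy the prefixes for levels 1 to 5 are 1, 3, 4, 5, and 7 characters long (anatomical main group, therapeutic subgroup, pharmacological subgroup, chemical subgroup, chemical substance). In the output below, level = 3 returns "L01" and level = 4 returns "L01B", which are the WHO 2nd-level (therapeutic) and 3rd-level (pharmacological) prefixes, so read the level labels in the demo code with that in mind.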

Example: Convert ATC code “L01BA01” (Methotrexate)

atc_code <- "L01BA01"

# Get code representations at different levels
print(paste("L1 (Anatomical Main Group):", atc_convert(atc_code, level = 1))) # L
## [1] "L1 (Anatomical Main Group): L"
print(paste("L3 (Therapeutic Subgroup):", atc_convert(atc_code, level = 3))) # L01
## [1] "L3 (Therapeutic Subgroup): L01"
print(paste("L4 (Chemical Subgroup):", atc_convert(atc_code, level = 4))) # L01B
## [1] "L4 (Chemical Subgroup): L01B"
# Level 5 is usually the full substance code
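
For comparison, here is a sketch of level truncation that follows the WHO ATC level definitions, where levels 1 to 5 correspond to prefixes of 1, 3, 4, 5, and 7 characters. The level numbering here is an assumption of the sketch and may not match the level argument of atc_convert(); atc_level_sketch() is a hypothetical name.

```r
# Prefix length for WHO ATC levels 1..5.
atc_prefix_len <- c(1L, 3L, 4L, 5L, 7L)

# Hypothetical helper: truncate a code to the prefix for a WHO level.
atc_level_sketch <- function(code, level) {
  stopifnot(level %in% 1:5, nchar(code) >= atc_prefix_len[level])
  substr(code, 1, atc_prefix_len[level])
}

levels_out <- vapply(1:5, function(l) atc_level_sketch("L01BA01", l), character(1))
levels_out
# "L" "L01" "L01B" "L01BA" "L01BA01"
```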

Feature 5b: ATC Drug-Drug Interactions (get_ddi)

This function loads a predefined dataset of potential Drug-Drug Interactions (DDIs), typically represented by pairs of interacting ATC codes.

Example: Load and view the first few DDI pairs

# Load the DDI dataset bundled with the package/data
ddi_data <- get_ddi()

# Display the structure (first few rows)
# Assumes columns are ATC_i, ATC_j
kable(head(ddi_data))
ATC_i    ATC_j
S01AA19  N01AH01
S01AA19  N02AB03
J01CA01  N01AH01
J01CA01  N02AB03
N01AB08  R03DA05
J01XX08  J01DH03
# Optionally, show total number of interactions listed
# print(paste("Total DDI records loaded:", nrow(ddi_data)))
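
A typical use of such a pair table is checking whether two drugs are listed as interacting. The sketch below uses three of the rows shown above and assumes that pairs are unordered, so it checks both directions; interacts() is a hypothetical helper.

```r
# Three rows copied from the DDI table above.
ddi <- data.frame(
  ATC_i = c("S01AA19", "S01AA19", "J01CA01"),
  ATC_j = c("N01AH01", "N02AB03", "N01AH01"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if the pair is listed in either order.
interacts <- function(a, b, ddi) {
  any((ddi$ATC_i == a & ddi$ATC_j == b) |
      (ddi$ATC_i == b & ddi$ATC_j == a))
}

interacts("S01AA19", "N01AH01", ddi)  # TRUE: listed as written
interacts("N01AH01", "S01AA19", ddi)  # TRUE: same pair, reversed
interacts("S01AA19", "J01CA01", ddi)  # FALSE: not in this subset
```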

Summary & Future Work

Summary

  • medcode enables standardized handling of clinical code vocabularies
  • Unified functions for code lookup, hierarchy navigation, and cross-system mapping
  • Easily extendable to support new coding systems or custom vocabularies
  • Ready for integration into R-based medical informatics pipelines

Next Steps / Future Work

  • Finalize Vignettes: Complete detailed tutorials (vignettes) showcasing common use cases and workflows.
  • Complete pkgdown Website: Finish the documentation website for easy access to all documentation (including existing help pages) and examples.
  • Expand Coverage: Add support for more coding systems and mapping tables based on user needs.
  • Implement Testing: Create comprehensive unit tests using testthat to ensure code quality and stability.