RHealth Medical Code Module
R/medicine 2025 Prototype Showcase
Zhixia Ren
2025-05-25
Source: vignettes/MedCode.rmd
Introduction
- Goal: Showcase the core functionality of the RHealth::medcode module.
- Supports major medical coding systems (e.g., ICD, ATC) natively in R.
- Built with inspiration from the Python-based PyHealth.medcode.
- Key Features to Demo:
  - Code Lookup
  - Hierarchy Navigation (Ancestors/Descendants)
  - Cross-System Mapping
  - ATC Specific Utilities
Data Handling
Download & Caching
The package automatically downloads required medical code datasets (CSVs) and caches them locally for faster and offline use.
- Uses rappdirs to find a user-specific cache directory.
- The download_medcode() function checks the cache first and downloads only if the file is missing.
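The cache-first behaviour described above can be sketched in base R. This is a minimal illustration, not the RHealth implementation: cached_path() and fake_fetch() are hypothetical names, the cache lives in a temporary directory rather than the rappdirs location, and the "download" is a stand-in that writes a tiny file so the sketch runs offline.

```r
# Hypothetical helper (not part of RHealth): return the cached file path,
# calling `fetch` only when the file is not already in `cache_dir`.
cached_path <- function(name, cache_dir, fetch) {
  dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
  path <- file.path(cache_dir, paste0(name, ".csv"))
  if (!file.exists(path)) {
    fetch(path)  # cache miss: a real version would download the file here
  }
  path
}

# Stand-in for a network download; counts how often it is called.
n_fetches <- 0
fake_fetch <- function(path) {
  n_fetches <<- n_fetches + 1
  writeLines(c("code,name", "428,Heart failure"), path)
}

demo_cache <- file.path(tempdir(), "medcode-cache-demo")
unlink(demo_cache, recursive = TRUE)  # start from an empty cache

first  <- cached_path("ICD9CM", demo_cache, fake_fetch)  # miss: fetches
second <- cached_path("ICD9CM", demo_cache, fake_fetch)  # hit: no fetch
print(n_fetches)
```

Both calls return the same path, and the fetch function runs only on the first one.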
Let’s see where the data is/will be cached:
# Get the platform-specific cache directory path
cache_dir <- rappdirs::user_cache_dir("RHealth", "medcode")
print(paste("Data cache directory:", cache_dir))
Demonstrating download_medcode():
(This step ensures the data file exists locally, downloading only if necessary)
# Specify the dataset name (e.g., "ICD9CM")
dataset_name <- "ICD9CM"
# Call the function - returns path. Downloads ONLY if not cached.
# NOTE: For a live demo, ensure network access OR pre-cache the file!
# We assume it's pre-cached here.
file_path <- download_medcode(name = dataset_name)
print(paste("Path for", dataset_name, ":", file_path))
print(paste("File exists:", fs::file_exists(file_path)))
Loading Data (load_medcode)
The load_medcode() function is used internally by most other functions (lookup_code, get_ancestors, etc.).
- It first calls download_medcode() to ensure the data file is available locally.
- Then, it reads the CSV file into an R data frame (tibble) using readr::read_csv.
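One detail worth keeping in mind when reading code tables from CSV is column typing. The sketch below uses base R's utils::read.csv on a toy file (the column names are assumptions, and this is not how load_medcode() is confirmed to work) to show that default type guessing can alter codes, while reading every column as character preserves them. With readr, the corresponding option is col_types = readr::cols(.default = readr::col_character()).

```r
# Toy CSV standing in for a cached code table (column names are assumptions).
csv_path <- tempfile(fileext = ".csv")
writeLines(
  c("code,name", "428.0,Congestive heart failure", "038,Septicemia"),
  csv_path
)

# Default type guessing parses the codes as numbers, which loses the
# trailing ".0" and the leading zero.
guessed <- utils::read.csv(csv_path)

# Forcing character columns keeps the codes exactly as written.
as_text <- utils::read.csv(csv_path, colClasses = "character")

print(class(guessed$code))  # numeric
print(as_text$code)         # "428.0" "038"
```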
Example: Loading the ICD9CM data
# Load the data using the function
# This implicitly calls download_medcode first
icd9_data_example <- load_medcode("ICD9CM")
# Confirm data is loaded by checking dimensions and showing first few rows
print(paste("Loaded ICD9CM - Dimensions:", paste(dim(icd9_data_example), collapse = " x ")))
print("First few rows:")
knitr::kable(head(icd9_data_example))
Feature 1: Code Lookup (lookup_code)
This function retrieves the description and potentially other details for a specific medical code within a given coding system.
Example: Look up ICD-9-CM code “428.0”
# Input: code string and system name
code_info <- lookup_code(code = "428.0", system = "ICD9CM")
# Output: A tibble/data frame row with information
# Using kable() for potentially nicer table output in the presentation
knitr::kable(code_info)
Feature 2: Hierarchy - Ancestors (get_ancestors)
This function navigates the code hierarchy upwards to find all parent and ancestor codes.
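As a rough illustration of upward navigation, the sketch below walks parent pointers in a toy table until it reaches a root. The code and parent_code column names and the toy_ancestors() helper are assumptions made for this example; the actual data layout and implementation in RHealth may differ.

```r
# Toy hierarchy; the `code` / `parent_code` layout is an assumption.
toy_hierarchy <- data.frame(
  code        = c("428", "428.0", "428.2", "428.22"),
  parent_code = c(NA, "428", "428", "428.2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: follow parent pointers upwards, nearest ancestor first.
toy_ancestors <- function(code, hierarchy) {
  found <- character(0)
  current <- code
  repeat {
    parent <- hierarchy$parent_code[match(current, hierarchy$code)]
    if (is.na(parent)) break  # root reached, or code not in the table
    found <- c(found, parent)
    current <- parent
  }
  found
}

print(toy_ancestors("428.22", toy_hierarchy))  # "428.2" "428"
```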
Example: Find ancestors for ICD-9-CM code “428.22” (Chronic systolic heart failure)
# Input: code string and system name
ancestors <- get_ancestors(code = "428.22", system = "ICD9CM")
# Output: A character vector of ancestor codes
print(ancestors)
Feature 3: Hierarchy - Descendants (get_descendants)
This function navigates the code hierarchy downwards to find all child and descendant codes.
Example: Find descendants for ICD-9-CM code “428” (Heart failure)
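The call for this example is not shown here. By analogy with get_ancestors(), it is presumably get_descendants(code = "428", system = "ICD9CM"), but that signature is an assumption. The self-contained sketch below illustrates the idea on a toy table instead: starting from a code, it collects children level by level until no more are found. The column names and the toy_descendants() helper are hypothetical.

```r
# Toy hierarchy; the `code` / `parent_code` layout is an assumption.
toy_tree <- data.frame(
  code        = c("428", "428.0", "428.2", "428.20", "428.21", "428.22"),
  parent_code = c(NA, "428", "428", "428.2", "428.2", "428.2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: breadth-first collection of all descendants.
toy_descendants <- function(code, hierarchy) {
  found <- character(0)
  frontier <- code
  while (length(frontier) > 0) {
    # Children are the rows whose parent is in the current frontier.
    children <- hierarchy$code[hierarchy$parent_code %in% frontier]
    found <- c(found, children)
    frontier <- children
  }
  found
}

print(toy_descendants("428", toy_tree))
```

In this toy table the result contains the two direct children of "428" followed by the three children of "428.2".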
Feature 4: Cross-System Mapping
Feature 4a: Supported Mappings
Let’s see which mappings (crosswalks) are currently supported. The supported_cross() function returns a list of identifiers for all available code system mappings within the package.
# Call the function to get the list of supported crosswalks
available_crosswalks <- supported_cross()
# Print the available mapping identifiers
print(available_crosswalks)
Feature 5: ATC Specific Functions
Now let’s look at utilities specifically designed for the ATC (Anatomical Therapeutic Chemical) classification system for drugs.
Feature 5a: ATC Level Conversion (atc_convert)
This utility function truncates an ATC code to get its representation at different classification levels (L1 to L5).
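Under the WHO ATC code structure, the five levels correspond to prefixes of 1, 3, 4, 5, and 7 characters. The sketch below applies that truncation with a hypothetical toy_atc_level() helper. It illustrates the technique only and is not the atc_convert() implementation, whose level handling may differ.

```r
# Prefix lengths of ATC levels 1 to 5 under the WHO code structure.
atc_prefix_len <- c(1L, 3L, 4L, 5L, 7L)

# Hypothetical stand-in for atc_convert(); not RHealth's implementation.
toy_atc_level <- function(code, level = 5L) {
  stopifnot(level %in% 1:5)
  substr(code, 1L, atc_prefix_len[level])
}

atc_levels <- vapply(
  1:5,
  function(l) toy_atc_level("L01BA01", l),
  character(1)
)
print(atc_levels)  # "L" "L01" "L01B" "L01BA" "L01BA01"
```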
Example: Convert ATC code “L01BA01” (Methotrexate)
atc_code <- "L01BA01"
# Get code representations at different levels
print(paste("L1 (Anatomical Main Group):", atc_convert(atc_code, level = 1))) # L
print(paste("L3 (Pharmacological Subgroup):", atc_convert(atc_code, level = 3))) # L01B under the WHO ATC structure
print(paste("L4 (Chemical Subgroup):", atc_convert(atc_code, level = 4))) # L01BA under the WHO ATC structure
# Level 5 is usually the full substance code