DeepRHealth::medcode Module Demo
R/medicine 2025 Prototype Showcase
Zhixia Ren
05/25/2025
Source: vignettes/MedCode.rmd
Introduction
- Goal: Showcase the core functionality of the DeepRHealth::medcode module.
- Supports major medical coding systems (e.g., ICD, ATC) natively in R.
- Built with inspiration from the Python-based PyHealth.medcode.
- Key Features to Demo:
  - Code Lookup
  - Hierarchy Navigation (Ancestors/Descendants)
  - Cross-System Mapping
  - ATC Specific Utilities
Environment Setup
Loading Package & Dependencies
- We’ll load the package directly from source using devtools::load_all().
- This requires running the presentation from the package’s root directory.
- We also load other necessary packages like knitr.
# Load required libraries for loading and display
library(devtools)
library(knitr)
library(rappdirs) # Used by DeepRHealth internally for caching
library(fs) # Used by DeepRHealth internally for caching
# Load all functions, data, etc. from the DeepRHealth package source
# IMPORTANT: Ensure R's working directory is the package root!
load_all(".")
# Optional: Display where medcode data would be cached
# cache_dir <- user_cache_dir("DeepRHealth", "medcode")
# print(paste("Data cache directory:", cache_dir))
# NOTE: Assume necessary data (e.g., ICD9CM.csv) is pre-cached for demo speed.
Data Handling
Download & Caching
The package automatically downloads required medical code datasets (CSVs) and caches them locally for faster and offline use.
- Uses rappdirs to find a user-specific cache directory.
- The download_medcode() function checks the cache first and downloads only if the file is missing.
Let’s see where the data is/will be cached:
# Get the platform-specific cache directory path
cache_dir <- rappdirs::user_cache_dir("DeepRHealth", "medcode")
print(paste("Data cache directory:", cache_dir))
## [1] "Data cache directory: ~/.cache/DeepRHealth"
Demonstrating download_medcode():
(This step ensures the data file exists locally, downloading only if necessary)
# Specify the dataset name (e.g., "ICD9CM")
dataset_name <- "ICD9CM"
# Call the function - returns path. Downloads ONLY if not cached.
# NOTE: For a live demo, ensure network access OR pre-cache the file!
# We assume it's pre-cached here.
file_path <- download_medcode(name = dataset_name)
print(paste("Path for", dataset_name, ":", file_path))
## [1] "Path for ICD9CM : ~/.cache/RHealth/medcode/ICD9CM.csv"
print(paste("File exists:", fs::file_exists(file_path)))
## [1] "File exists: TRUE"
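The cache-first behaviour described above can be sketched in a few lines of base R. This is only an illustration of the pattern, not the package implementation: cached_fetch() and its fetch argument are hypothetical names, and the downloader is passed in as a function so the example runs without network access.

```r
# Hypothetical helper illustrating a cache-first fetch; not DeepRHealth code.
# `fetch` is any function(name, dest) that writes the dataset to `dest`.
cached_fetch <- function(name, cache_dir, fetch) {
  path <- file.path(cache_dir, paste0(name, ".csv"))
  if (!file.exists(path)) {
    # Cache miss: create the directory if needed, then fetch once.
    dir.create(cache_dir, recursive = TRUE, showWarnings = FALSE)
    fetch(name, path)
  }
  path
}

# Stand-in downloader that writes a tiny CSV and counts how often it is called.
n_calls <- 0
fake_fetch <- function(name, dest) {
  n_calls <<- n_calls + 1
  writeLines(c("code,parent_code,name", "374.21,374.2,Paralytic lagophthalmos"), dest)
}

demo_cache <- file.path(tempdir(), "medcode-cache-demo")
unlink(demo_cache, recursive = TRUE)  # start from an empty cache for the demo
first  <- cached_fetch("ICD9CM", demo_cache, fake_fetch)  # miss: calls fake_fetch
second <- cached_fetch("ICD9CM", demo_cache, fake_fetch)  # hit: no fetch
```

After both calls, n_calls is 1 and both calls return the same path, which is the property that makes repeated and offline runs cheap.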
Loading Data (load_medcode)
The load_medcode() function is used internally by most other functions (lookup_code, get_ancestors, etc.).
- It first calls download_medcode() to ensure the data file is available locally.
- Then, it reads the CSV file into an R data frame (tibble) using readr::read_csv.
Example: Loading the ICD9CM data
# Load the data using the function
# This implicitly calls download_medcode first
icd9_data_example <- load_medcode("ICD9CM")
# Confirm data is loaded by checking dimensions and showing first few rows
print(paste("Loaded ICD9CM - Dimensions:", paste(dim(icd9_data_example), collapse = " x ")))
## [1] "Loaded ICD9CM - Dimensions: 17736 x 3"
print("First few rows:")
## [1] "First few rows:"
| code | parent_code | name |
|---|---|---|
| 806.11 | 806.1 | Open fracture of C1-C4 level with complete lesion of cord |
| 642.41 | 642.4 | Mild or unspecified pre-eclampsia, delivered, with or without mention of antepartum condition |
| 647.13 | 647.1 | Gonorrhea of mother, complicating pregnancy, childbirth, or the puerperium, antepartum condition or complication |
| 374.21 | 374.2 | Paralytic lagophthalmos |
| 679.00 | 679.0 | Maternal complications from in utero procedure, unspecified as to episode of care or not applicable |
| 013.41 | 013.4 | Tuberculoma of spinal cord, bacteriological or histological examination not done |
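Codes such as 013.41 and 679.00 in the table above keep their leading and trailing zeros only if the code columns are read as text. The sketch below shows this with base R read.csv() rather than readr::read_csv(), to stay dependency-free. The temporary file and the three sample rows are stand-ins for the real dataset, and nothing here is taken from load_medcode() itself.

```r
# Write a tiny stand-in CSV with the same three columns as the table above.
sample_rows <- data.frame(
  code        = c("374.21", "679.00", "013.41"),
  parent_code = c("374.2", "679.0", "013.4"),
  name        = c(
    "Paralytic lagophthalmos",
    "Maternal complications from in utero procedure, unspecified as to episode of care or not applicable",
    "Tuberculoma of spinal cord, bacteriological or histological examination not done"
  ),
  stringsAsFactors = FALSE
)
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_rows, csv_path, row.names = FALSE)

# Force every column to character so "013.41" and "679.00" keep their zeros.
codes <- read.csv(csv_path, colClasses = "character")
codes$code
```

Without an explicit column type, a CSV reader may guess that such a column is numeric, which would change how the codes look and break exact string matching in later lookups.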
Feature 1: Code Lookup (lookup_code)
This function retrieves the description and potentially other details for a specific medical code within a given coding system.
Example: Look up ICD-9-CM code “428.0”
# Input: code string and system name
code_info <- lookup_code(code = "428.0", system = "ICD9CM")
# Output: A tibble/data frame row with information
# Using kable() for potentially nicer table output in the presentation
kable(code_info)
| code | parent_code | name |
|---|---|---|
| 428.0 | 428 | Congestive heart failure, unspecified |
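Conceptually, a lookup like this is a row filter on the code table. The following base R sketch uses a hypothetical helper, lookup_in_table(), over a two-row toy table copied from outputs shown in this demo. It is an assumption about the general shape of such a lookup, not the package code.

```r
# Toy code table with rows taken from the outputs shown in this demo.
code_table <- data.frame(
  code        = c("428.0", "374.21"),
  parent_code = c("428", "374.2"),
  name        = c("Congestive heart failure, unspecified", "Paralytic lagophthalmos"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the matching row(s), or zero rows if unknown.
lookup_in_table <- function(code, tbl) {
  tbl[tbl$code == code, , drop = FALSE]
}

hit  <- lookup_in_table("428.0", code_table)   # one matching row
miss <- lookup_in_table("999.99", code_table)  # code not in the toy table
```

Returning a zero-row data frame for an unknown code, instead of raising an error, keeps the result type stable for downstream code; whether lookup_code() behaves this way is not stated in the source.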
Feature 2: Hierarchy - Ancestors (get_ancestors)
This function navigates the code hierarchy upwards to find all parent and ancestor codes.
Example: Find ancestors for ICD-9-CM code “428.22” (Systolic heart failure, acute on chronic)
# Input: code string and system name
ancestors <- get_ancestors(code = "428.22", system = "ICD9CM")
# Output: A character vector of ancestor codes
print(ancestors)
##       428.22        428.2          428   420-429.99   390-459.99
##      "428.2"        "428" "420-429.99" "390-459.99" "001-999.99"
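The upward walk can be pictured as repeatedly following a child-to-parent link until no parent is recorded. The sketch below reuses the child and parent pairs printed above; walk_up() is a hypothetical helper written for illustration and is not taken from the package.

```r
# Child -> parent links, copied from the output above.
parent_of <- c(
  "428.22"     = "428.2",
  "428.2"      = "428",
  "428"        = "420-429.99",
  "420-429.99" = "390-459.99",
  "390-459.99" = "001-999.99"
)

# Hypothetical helper: follow parent links until a code has no recorded parent.
walk_up <- function(code, parent_of) {
  ancestors <- character(0)
  current <- code
  while (current %in% names(parent_of)) {
    current <- parent_of[[current]]
    if (current %in% ancestors) break  # guard against a cyclic table
    ancestors <- c(ancestors, current)
  }
  ancestors
}

walk_up("428.22", parent_of)
```

The result lists ancestors from nearest to farthest, ending at the root range 001-999.99.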
Feature 3: Hierarchy - Descendants (get_descendants)
This function navigates the code hierarchy downwards to find all child and descendant codes.
Example: Find descendants for ICD-9-CM code “428” (Heart failure)
# Input: code string and system name
descendants <- get_descendants(code = "428", system = "ICD9CM")
# Output: A character vector of descendant codes (can be long!)
print(head(descendants)) # Show only the first few for brevity
## [1] "428.4" "428.1" "428.0" "428.3" "428.2" "428.9"
## [1] "Total descendants found: 18"
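The downward direction is a breadth-first expansion over the same code and parent_code columns. The sketch below runs on a small toy subset of the 428 subtree, not the full 18 descendants reported above; collect_descendants() is a hypothetical name used only for this illustration.

```r
# Toy subset of the hierarchy under 428 (not the complete subtree).
toy <- data.frame(
  code        = c("428", "428.0", "428.2", "428.20", "428.21", "428.22", "428.23"),
  parent_code = c("420-429.99", "428", "428", "428.2", "428.2", "428.2", "428.2"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: expand level by level until no new children appear.
collect_descendants <- function(code, tbl) {
  found <- character(0)
  frontier <- code
  while (length(frontier) > 0) {
    children <- tbl$code[!is.na(tbl$parent_code) & tbl$parent_code %in% frontier]
    children <- setdiff(children, found)  # skip codes already collected
    found <- c(found, children)
    frontier <- children
  }
  found
}

collect_descendants("428", toy)
```

Direct children come first in the result, followed by grandchildren, because each pass of the loop handles one level of the tree.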
Feature 4: Cross-System Mapping
Feature 4a: Supported Mappings (supported_cross)
Let’s see which mappings (crosswalks) are currently supported. The supported_cross() function returns a character vector of identifiers for all available code system mappings within the package.
# Call the function to get the list of supported crosswalks
available_crosswalks <- supported_cross()
# Print the available mapping identifiers
print(available_crosswalks)
## [1] "ICD9CM_to_CCSCM" "ICD9PROC_to_CCSPROC" "ICD10CM_to_CCSCM"
## [4] "ICD10PROC_to_CCSPROC" "NDC_to_ATC" "ICD10CM_to_ICD9CM"
## [7] "ICD9CM_to_ICD10CM" "ICD10PCS_to_ICD9PCS" "ICD9PCS_to_ICD10PCS"
## [10] "ICD10CMPCS_to_ICD9CM"
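Because each identifier follows a SOURCE_to_TARGET pattern, it can be split into its two systems with base R. The sketch below uses three identifiers copied from the output above; is_supported() is a hypothetical convenience check, not a package function.

```r
# Three identifiers copied from the output above.
ids <- c("ICD9CM_to_CCSCM", "NDC_to_ATC", "ICD10CM_to_ICD9CM")

# Split each identifier on the literal "_to_" separator.
parts <- do.call(rbind, strsplit(ids, "_to_", fixed = TRUE))
pairs <- data.frame(from = parts[, 1], to = parts[, 2], stringsAsFactors = FALSE)

# Hypothetical helper: is a given direction listed among the identifiers?
is_supported <- function(from, to, ids) {
  paste0(from, "_to_", to) %in% ids
}

is_supported("NDC", "ATC", ids)  # listed
is_supported("ATC", "NDC", ids)  # reverse direction is not listed
```

A check like this makes the direction of a crosswalk explicit: a mapping listed one way is not automatically available the other way.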
Feature 4b: Translate Codes (map_code)
This function translates codes from one coding system to another using pre-defined mapping tables.
Example: Map ICD-9-CM “428.0” to the CCSCM system (Clinical Classifications Software)
# Input: code, source system (from), target system (to)
mapped_code <- map_code(code = "428.0", from = "ICD9CM", to = "CCSCM")
# Output: The corresponding code(s) in the target system
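print(mapped_code)
## character(0)
The result character(0) is an empty character vector, meaning no mapping was returned for this input in this run.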
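A crosswalk lookup of this kind can be pictured as a filter on a two-column table. The sketch below uses a toy table whose keys are stored without the decimal point; that storage format, the two rows, and the helper map_with_table() are all assumptions made for illustration. It shows how an exact-match lookup yields an empty result when the key format differs, which is one general cause of empty crosswalk results. The source does not establish whether that is what happened in the demo.

```r
# Toy crosswalk; the two rows are illustrative stand-ins, not package data.
crosswalk <- data.frame(
  from = c("4280", "4019"),
  to   = c("108", "98"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return every target code whose source key matches.
map_with_table <- function(code, crosswalk, strip_dot = FALSE) {
  key <- if (strip_dot) gsub(".", "", code, fixed = TRUE) else code
  crosswalk$to[crosswalk$from == key]
}

dotted   <- map_with_table("428.0", crosswalk)                    # no key "428.0"
undotted <- map_with_table("428.0", crosswalk, strip_dot = TRUE)  # key "4280" matches
```

Returning a character vector rather than a single string leaves room for one-to-many mappings, and an empty vector signals that no match was found.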
Feature 5: ATC Specific Functions
Now let’s look at utilities specifically designed for the ATC (Anatomical Therapeutic Chemical) classification system for drugs.
Feature 5a: ATC Level Conversion (atc_convert)
This utility function truncates an ATC code to get its representation at different classification levels (L1 to L5).
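For reference, the WHO ATC standard defines five levels with prefix lengths of 1, 3, 4, 5, and 7 characters. The sketch below truncates a full code at those widths. It is independent of atc_convert(): atc_who_prefix() is a hypothetical helper, and the level argument in the example that follows returns L01 for level = 3, so the package numbering appears to differ from the WHO numbering used here.

```r
# Prefix lengths of the five WHO ATC levels:
# anatomical main group, therapeutic subgroup, pharmacological subgroup,
# chemical subgroup, chemical substance.
who_atc_widths <- c(1, 3, 4, 5, 7)

# Hypothetical helper: prefix of a full 7-character code at WHO level(s) 1 to 5.
atc_who_prefix <- function(code, who_level) {
  stopifnot(all(who_level %in% 1:5), nchar(code) == 7)
  substring(code, 1, who_atc_widths[who_level])
}

atc_who_prefix("L01BA01", 1:5)
```

For L01BA01 this gives L, L01, L01B, L01BA, and L01BA01, one prefix per WHO level.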
Example: Convert ATC code “L01BA01” (Methotrexate)
atc_code <- "L01BA01"
# Get code representations at different levels
print(paste("L1 (Anatomical Main Group):", atc_convert(atc_code, level = 1))) # L
## [1] "L1 (Anatomical Main Group): L"
print(paste("L3 (Therapeutic Subgroup):", atc_convert(atc_code, level = 3))) # L01
## [1] "L3 (Therapeutic Subgroup): L01"
print(paste("L4 (Pharmacological Subgroup):", atc_convert(atc_code, level = 4))) # L01B
## [1] "L4 (Pharmacological Subgroup): L01B"
# Level 5 is usually the full substance code
Feature 5b: ATC Drug-Drug Interactions (get_ddi)
This function loads a predefined dataset of potential Drug-Drug Interactions (DDIs), typically represented by pairs of interacting ATC codes.
Example: Load and view the first few DDI pairs
# Load the DDI dataset bundled with the package/data
ddi_data <- get_ddi()
# Display the structure (first few rows)
# Assumes columns are ATC_i, ATC_j
kable(head(ddi_data))
| ATC_i | ATC_j |
|---|---|
| S01AA19 | N01AH01 |
| S01AA19 | N02AB03 |
| J01CA01 | N01AH01 |
| J01CA01 | N02AB03 |
| N01AB08 | R03DA05 |
| J01XX08 | J01DH03 |
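A pair table like this can be used to screen a medication list: keep every row whose two codes both appear in the list. The sketch below copies the six pairs shown above into a data frame; find_interactions() and the example medication list are hypothetical and only illustrate the idea.

```r
# The six DDI pairs shown in the table above.
ddi <- data.frame(
  ATC_i = c("S01AA19", "S01AA19", "J01CA01", "J01CA01", "N01AB08", "J01XX08"),
  ATC_j = c("N01AH01", "N02AB03", "N01AH01", "N02AB03", "R03DA05", "J01DH03"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: rows where both interacting drugs are in `meds`.
find_interactions <- function(meds, ddi) {
  both_present <- (ddi$ATC_i %in% meds) & (ddi$ATC_j %in% meds)
  ddi[both_present, , drop = FALSE]
}

# Example medication list built only from codes that appear in the table.
meds <- c("J01CA01", "N02AB03", "R03DA05")
flagged <- find_interactions(meds, ddi)
```

Here only the J01CA01 and N02AB03 pair is flagged; R03DA05 is in the list, but its listed partner N01AB08 is not.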
# Optionally, show total number of interactions listed
# print(paste("Total DDI records loaded:", nrow(ddi_data)))
Summary & Future Work
Summary
- medcode enables standardized handling of clinical code vocabularies
- Unified functions for code lookup, hierarchy navigation, and cross-system mapping
- Easily extendable to support new coding systems or custom vocabularies
- Ready for integration into R-based medical informatics pipelines
Next Steps / Future Work
- Finalize Vignettes: Complete detailed tutorials (vignettes) showcasing common use cases and workflows.
- Complete pkgdown Website: Finish the documentation website for easy access to all documentation (including existing help pages) and examples.
- Expand Coverage: Add support for more coding systems and mapping tables based on user needs.
- Implement Testing: Create comprehensive unit tests using testthat to ensure code quality and stability.