Feature processor for encoding categorical sequences (e.g., medical codes) into
numerical indices. The vocabulary is built dynamically: tokens not seen before
are assigned the next available index.
Public fields
code_vocab
A named integer vector representing token-to-index mappings.
.next_index
The next available index for unseen tokens.
Methods
Inherited methods
Method new()
Initialize the processor with a default vocabulary of special tokens.
Method process()
Process a sequence of tokens into a tensor of indices.
Usage
SequenceProcessor$process(value)
Arguments
value
A character vector of tokens.
Returns
A long-type tensor of indices.
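The lookup behind process() can be illustrated with a minimal sketch in base R. This is not the package's implementation: encode_tokens() is a hypothetical helper, the special token names "<pad>" and "<unk>" and their indices 0 and 1 are assumptions, and a plain integer vector stands in for the long-type tensor the real method returns.

```r
# Hypothetical stand-in for the documented code_vocab field: a named integer
# vector. The special token names and their indices are assumptions.
code_vocab <- c("<pad>" = 0L, "<unk>" = 1L)

# Hypothetical helper: map tokens to indices, growing the vocabulary
# whenever an unseen token appears.
encode_tokens <- function(tokens, vocab, next_index) {
  indices <- integer(length(tokens))
  for (i in seq_along(tokens)) {
    token <- tokens[[i]]
    if (!token %in% names(vocab)) {
      # Unseen token: assign the next available index and advance the counter.
      vocab[token] <- next_index
      next_index <- next_index + 1L
    }
    indices[[i]] <- vocab[[token]]
  }
  # Return the updated state explicitly, since plain vectors are copied on
  # modification (an R6 object would update its own fields instead).
  list(indices = indices, vocab = vocab, next_index = next_index)
}

result <- encode_tokens(c("A01", "B02", "A01"), code_vocab, next_index = 2L)
print(result$indices)
print(result$vocab)
```

A repeated token ("A01" here) reuses the index it received on first sight, so the same code always maps to the same integer within one processor. Converting the integer vector to a long-type tensor is left out to keep the sketch free of dependencies.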
Method size()
Return the size of the vocabulary.
Method print()
Print a summary of the processor.
Usage
SequenceProcessor$print(...)
Method clone()
The objects of this class are cloneable with this method.
Usage
SequenceProcessor$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.