HLA Serolizer

Prediction of serological specificity for novel HLA alleles using the random forest machine learning models from the HLA Dictionary 2022

Quick Start: Select an HLA locus and enter a mature protein sequence (pre-aligned to the reference sequence) for serology prediction. The sequence currently in the text box, the recombinant allele A*24:24, is provided as an example.

Select HLA Locus

Sequence for Serology Prediction



This web tool is based on the random forest machine learning models built for serological specificity prediction in the HLA Dictionary 2022 project. In that project, the models were used to predict the serological specificity of alleles that had not previously been characterized by serological methods. Here, we have made the models available for use on any novel HLA sequence.
For each HLA locus, there is an independent random forest model for each serological specificity. When a sequence is entered into this tool, it is encoded and restricted to the positions that the random forest model determined to be "important" (based on Gini impurity). The random forest then outputs the probability that the sequence matches the serological label in question. This is repeated for each serological specificity of the locus.
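The encoding step described above might look roughly like the sketch below. It is illustrative only: the page does not say how sequences are encoded, so one-hot encoding is an assumption, and the helper names, the toy sequence, and the "important" positions are all hypothetical rather than the tool's actual feature set.

```python
# Hypothetical sketch: restrict an aligned sequence to a set of "important"
# positions and one-hot encode the residues found there. One-hot encoding,
# the positions, and all names here are assumptions made for illustration.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def select_positions(aligned_seq, positions):
    """Return the residues at the given 1-based positions of an aligned sequence."""
    return [aligned_seq[p - 1] for p in positions]


def one_hot(residue):
    """Encode one residue as a 20-element 0/1 list; unknown symbols give all zeros."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def encode(aligned_seq, positions):
    """Concatenate the one-hot vectors for the selected positions into one feature row."""
    features = []
    for residue in select_positions(aligned_seq, positions):
        features.extend(one_hot(residue))
    return features


# Toy input: a short made-up sequence and made-up positions.
example_seq = "GSHSMRYFYT"
example_positions = [1, 5, 9]
row = encode(example_seq, example_positions)
```

In a setup like this, a feature row of this kind would be passed to each per-specificity model in turn, and each model would return one probability.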
For the HLA Dictionary 2022, we used a probability threshold of 0.42 for a positive call, chosen from accuracy assessments of various thresholds. To give users a more complete picture, this tool reports the prediction probability for each serological label in its output.
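As a rough illustration of how the reported probabilities could be turned into calls, the sketch below applies the 0.42 threshold to a set of per-label probabilities. The function name and the probability values are made up for the example, and treating a value exactly equal to the threshold as positive is an assumption, since the page does not specify it.

```python
# Hypothetical sketch: apply a positive-call threshold to per-label
# probabilities. The probabilities below are invented for illustration and
# the inclusive comparison (>=) is an assumption.

THRESHOLD = 0.42  # threshold cited for the HLA Dictionary 2022


def call_specificities(probabilities, threshold=THRESHOLD):
    """Return the labels whose probability meets the threshold, highest first."""
    positives = [(label, p) for label, p in probabilities.items() if p >= threshold]
    positives.sort(key=lambda item: item[1], reverse=True)
    return [label for label, _ in positives]


# Invented probabilities for three serological labels of one locus.
example = {"A24": 0.81, "A9": 0.47, "A23": 0.12}
calls = call_specificities(example)
```

Because the tool reports every probability, a user who prefers a stricter or looser cutoff could pass a different `threshold` value instead of 0.42.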


Contact Us

For scientific and technical queries, contact pathologygragertlab@tulane.edu.

Cite HLA Dictionary Serology Prediction tool

Manuscript in preparation

Prototype Tool for Research Use Only