Key Points
Developed a comprehensive genetic testing program using PacBio long-read sequencing for hemophilia diagnosis.
Successfully detected complex structural variants, improving diagnostic accuracy over traditional methods.
Hemophilia is an X-linked bleeding disorder caused by defects in the F8 or F9 genes. Because F8 variants are highly diverse, conventional genetic testing typically requires a combination of several methods, and detecting rearrangements involving the intron 22 homologous regions (int22h) remains challenging. In this study, we developed a comprehensive hemophilia testing program on the PacBio long-read sequencing platform. Experimentally, we established a standard operating procedure for hybridization capture long-read sequencing (hc-LRS), which generates reads longer than 5 kb. Analytically, we employed a suite of bioinformatics tools to identify hemophilia-associated variants, including the detection of int22h-related rearrangements through de novo assembly of homologous haplotypes (DAHH). Our approach identified pathogenic variants in hemophilia patients and carriers, encompassing both single-nucleotide variants (SNVs) and structural variants (SVs), with full concordance with validated methods. Moreover, the program identified complex int22h rearrangements in several samples; such rearrangements were previously difficult to detect using traditional techniques. Compared with conventional methods, hc-LRS is more cost-effective and convenient, and it can detect multiple variant types in a single test. This approach provides a powerful tool for the genetic diagnosis of hemophilia, particularly in patients with unknown genetic backgrounds or complex variants. In conclusion, our comprehensive testing program represents a significant advance in the genetic diagnosis of hemophilia.