Collagen Screening: The End of an Era

We have decided to stop using analysis of procollagens produced by cultured fibroblasts for the diagnosis of osteogenesis imperfecta and some forms of Ehlers-Danlos syndrome.  While this has been a useful diagnostic tool for many years, rapid sequencing of genomic DNA by massively parallel next-generation sequencing has proven to be an analytically superior and more cost-effective way to diagnose forms of osteogenesis imperfecta and Ehlers-Danlos syndrome.

Osteogenesis imperfecta (OI)

When, about three decades ago, we launched the precursor of the Collagen Diagnostic Laboratory, the only tool that we had to confirm the diagnosis of osteogenesis imperfecta (OI) and some forms of Ehlers-Danlos syndrome was the analysis of the procollagens produced by cultured dermal fibroblasts.  At first, we accepted only fibroblasts, but within a fairly short time, as we created a real tissue culture facility, we could accept skin biopsies, grow the cells, and do the studies.  The studies were initially simple.  We plated cells at a confluent density, 250,000 cells per 35 mm dish, allowed them to attach and spread overnight, added ascorbate to the medium as a cofactor for the required prolyl and lysyl hydroxylases, and then added [3H]proline and let the cells incubate overnight.  We then harvested the cells and the culture medium, separated the labeled proteins by SDS-PAGE, soaked the gels in a fluor, and exposed the dried gel to x-ray film.  Proline is a major amino acid in collagens and makes up about 20% of the amino acids in the triple helical domain (in most proteins proline comprises only 2-3% of the amino acids).

In part because of the abundance of proline and because of a purification step by alcohol precipitation, only a few proteins are seen in the high molecular weight range: fibronectin and the proα chains of type I and type III procollagen and their partially processed products.  We recognized that cells from people with OI type I produced less type I procollagen than the control cells but that the electrophoretic mobility of the constituent proα1(I) and proα2(I) chains was normal.  Cells from the more severely affected individuals with OI type II, OI type III, or OI type IV produced some type I procollagen molecules in which the constituent chains had normal electrophoretic mobilities and others in which the chains had a slow mobility.  This shift in electrophoretic mobility was the consequence of increased post-translational modification, mostly lysyl hydroxylation and glycosylation of hydroxylysyl residues within the triple helical domain.  We thought that the cause of the increased modification was likely to be delayed winding of the triple helix, a process that occurs only once the proα chains have been completed and have assembled through carboxyl-terminal recognition sites, regions in the carboxyl-terminal propeptide that extends beyond the triple helical region and contains about 250 residues in each chain.  It soon became clear that substitutions for glycine residues in the triple helical domain (which consists of a repeating Gly-X-Y motif that is uninterrupted for 1014 amino acids) could delay folding of the triple helical domain and result in overmodification amino-terminal to the site of substitution.  In fact, any alteration in the native triple helical domain, including in-frame deletions or insertions that resulted from several mechanisms, had the same result.
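The Gly-X-Y rule described above lends itself to a simple check.  The following is a minimal sketch in Python, not a description of any laboratory pipeline: it assumes a one-letter amino acid string that begins at the first glycine of the triple helical domain, and the function name and the short test sequences are hypothetical, invented only for illustration.  With 1014 residues, the uninterrupted domain corresponds to 338 such triplets.

```python
def find_glycine_interruptions(helix: str) -> list[int]:
    """Return the 1-based positions at which a Gly-X-Y repeat lacks glycine.

    Assumption: `helix` starts at the first glycine of the triple helical
    domain, so every third residue (positions 1, 4, 7, ...) should be 'G'.
    """
    return [i + 1 for i in range(0, len(helix), 3) if helix[i] != "G"]


# Hypothetical toy sequences (three triplets each), not real collagen sequence.
normal = "GPPGAPGEK"
variant = "GPPSAPGEK"  # serine replaces the glycine at position 4

print(find_glycine_interruptions(normal))   # no interruptions found
print(find_glycine_interruptions(variant))  # reports position 4
```

A check like this only flags where the repeat is broken; as the text notes, the biochemical effect (overmodification amino-terminal to the substitution) depends on where in the domain the interruption falls.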

We could thus distinguish between cells from people with OI type I and those from other types of OI but we could not reliably predict, in a blinded fashion, if the cells came from an individual with the perinatal lethal form of OI or from someone with a milder form.  As a result we had to rely on detailed clinical information, radiographs, and discussions with the clinicians to assist in coming to a clear diagnosis.  In all these instances we could be reasonably sure that the biochemical alterations we saw resulted from dominant mutations in one of the type I collagen genes but even then we weren’t sure which gene had the mutation except for those with OI type I in which the reasonable assumption was that the mutation was in COL1A1, which encoded proα1(I) chains.

By the late 1980s we had found the first of the mutations that led to substitution for a glycine residue in the triple helical domain and could be sure that this class of mutation could explain at least some of the changes.  In short order we found that exon-skipping mutations, as well as single amino acid deletions and some short in-frame deletions or duplications that maintained the triple helical register, had similar biochemical effects.  We also began to recognize that there were some individuals with clear-cut diagnoses of OI in whose cells we could not detect abnormalities in type I collagen production or in the electrophoretic mobility of the constituent chains.

By the early 1990s we had started routine sequence analysis using gel-based dideoxy chain termination sequencing.  In retrospect, all this looks a little primitive: gel reads were reliable for only about 200-250 nucleotides, and the process was quite labor-intensive and used radiolabeled nucleotides.  The introduction of capillary-based sequence analysis and the use of multi-channel analyzers dramatically changed the efficiency of the process and allowed us to reliably sequence both genes simultaneously and to design highly reproducible strategies to see essentially all of the mutations that would alter protein sequence or production.  With this we recognized that mutations that altered the amino acid sequence in the first 100-200 residues of the triple helical domain might not alter the electrophoretic mobility of the chains, which explained why we could not identify alterations in cells from some individuals.

Although somewhat more expensive to perform because of the cost of reagents and the capital costs of the equipment, sequence analysis was clearly more informative than protein studies.  It yielded a discrete answer in more than 95% of people with OI (compared to about 87% achieved by protein studies) and identified the gene in which the mutations occurred.  It also provided the basis for simple and cheap analysis of others in the family, the ability to identify parental mosaicism, and the ability to do prenatal diagnosis and pre-implantation diagnosis.  Finally, it transformed the process from something of an art form, the reading of electrophoretic protein gels, into a qualitative digital form of sequence analysis.

This change also aligns us with the recommendations of a multi-laboratory task force that assessed the current strategies by which to confirm the diagnosis of OI and backed the proposal that the first tier of analysis would be sequence analysis of the type I collagen genes.  This was expected to identify mutations in about 95% of individuals with a clinical diagnosis of OI.  The next step, if no mutation was identified, was to reassess the clinical findings and, if the clinical diagnosis was consistent, then to proceed to analysis of the genes known to produce other forms of OI.  This approach could be stratified if, for example, OI type V appeared on clinical grounds to be the most likely diagnosis.  If not, then all genes would be sequenced concurrently.  If that failed to produce a result, deletion/duplication analysis of the type I collagen genes and analysis of type I collagen production by cultured cells could be used to look for deep intronic mutations that alter splicing, or for deletions or duplications of short segments that could not be readily detected by sequence analysis.  If all these strategies failed, then the family could be transferred to research protocols with the appropriate consents.
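The tiered strategy above can be read as a simple decision sequence.  The sketch below, in Python, is one hedged way to write it down and is not the task force's own algorithm: the class name, field names, and returned phrases are all hypothetical, and the branch taken when the clinical diagnosis no longer fits is an assumption, since the recommendation as summarized here does not say what happens in that case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CaseStatus:
    """Hypothetical record of what is known so far; None means 'not yet done'."""
    type_i_sequencing_positive: Optional[bool] = None
    clinical_diagnosis_consistent: Optional[bool] = None
    suspected_specific_type: Optional[str] = None  # e.g. "OI type V"
    other_genes_positive: Optional[bool] = None
    deldup_or_protein_positive: Optional[bool] = None


def next_step(case: CaseStatus) -> str:
    """Return the next tier of analysis for a case with a clinical diagnosis of OI."""
    # Tier 1: sequence analysis of the type I collagen genes.
    if case.type_i_sequencing_positive is None:
        return "sequence the type I collagen genes"
    if case.type_i_sequencing_positive:
        return "diagnosis confirmed"
    # Tier 2: reassess the clinical findings before testing further.
    if case.clinical_diagnosis_consistent is None:
        return "reassess the clinical findings"
    if not case.clinical_diagnosis_consistent:
        # Assumption: not specified in the text above.
        return "reconsider the clinical diagnosis"
    # Tier 3: genes for other forms of OI, stratified when a type is suspected.
    if case.other_genes_positive is None:
        if case.suspected_specific_type:
            return f"analyze the gene(s) for {case.suspected_specific_type} first"
        return "sequence all other OI genes concurrently"
    if case.other_genes_positive:
        return "diagnosis confirmed"
    # Tier 4: deletion/duplication analysis and protein studies in cultured cells.
    if case.deldup_or_protein_positive is None:
        return "deletion/duplication analysis and collagen studies in cultured cells"
    if case.deldup_or_protein_positive:
        return "diagnosis confirmed"
    # Tier 5: nothing found by any clinical test.
    return "refer to research protocols with appropriate consents"


print(next_step(CaseStatus()))
print(next_step(CaseStatus(type_i_sequencing_positive=False,
                           clinical_diagnosis_consistent=True)))
```

Writing the tiers as early returns keeps the order of testing explicit: each later tier is reached only when every earlier one has been completed without a result.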

Ehlers-Danlos syndrome (EDS)

EDS type IV, the vascular type

In addition to OI, analysis of collagens has been a useful tool to confirm the diagnosis of some forms of EDS.  EDS type IV, also known as the vascular type, results from mutations in COL3A1, which encodes the chains of type III procollagen.  The distribution of the types of mutations differs somewhat from that seen in COL1A1: about 65% result in substitution of glycine residues in the triple helical domain, about 25% affect splice sites, and a small number alter sequences in the carboxyl-terminal propeptide.  All told, just under 5% of the people with EDS type IV in whom we have identified mutations have alterations that result in mRNA instability and production of about half the normal amount of type III procollagen.  Like mutations in the type I collagen genes, those in COL3A1 that change the sequence of the protein by substitution for glycine residues in the triple helical domain, or that result in in-frame deletion or duplication, cause slow folding, overmodification, and a delay in secretion.  Careful comparison of chain mobility and of the distribution of type III pro/collagen between the medium and the intracellular compartment has proved to be an effective tool, but interpretation is an art for which the talent increases with experience.  We are less able to detect alterations in the amount of type III procollagen produced if there is heterozygosity for mutations that result in mRNA instability, in large part because the total amount of type III procollagen is significantly less than that of type I procollagen.

EDS type VII

EDS type VII, which comprises the arthrochalasia forms and the dermatosparaxis form, results from alterations in the cleavage of the amino-terminal propeptide from the major triple helical domain.  In the arthrochalasia forms, the defect usually is a consequence of a mutation that results in skipping of the exon that contains the cleavage site in either COL1A1 or COL1A2.  In the dermatosparaxis form, the defect is in the enzyme that achieves the cleavage, and the disorder is recessively inherited.  To detect these alterations it is necessary to incubate the cells in the presence of dextran sulfate, which reduces the extracellular space in which the substrate and enzyme interact and drives the reactions.  If this diagnosis is not suspected, then the routine diagnostic studies do not identify many of the cells in which there are defects and will not identify any of those from people with the recessive disorder.  In this context, directed sequence analysis of the splice sites that surround exon 6 of both COL1A1 and COL1A2 is an effective way to identify more than 90% of the known mutations.  If a mutation is not identified but the clinical picture is convincing, then analysis of cultured cells would provide an alternative.  Direct analysis of the ADAMTS2 gene is the most effective method by which to identify individuals with the dermatosparaxis form.

EDS type I/II, the classical type

Type V collagen is a minor product of cultured dermal fibroblasts, and mutations in COL5A1 and COL5A2 account for well over 90% of individuals with EDS type I and II, the “classical” type of EDS.  For reasons that are not clear, most labs, including our own, cannot reliably detect alterations in the amount of type V collagen produced or changes in its electrophoretic mobility.  As a consequence, sequence analysis of these two genes has become, by default, the principal diagnostic approach.

EDS type III

EDS type III, the hypermobility type, remains a challenge both from the clinical perspective and from the molecular diagnostic approach.  At this point there is no clear pathway to molecular diagnosis.

EDS type VI

EDS type VI, the kyphoscoliotic type, is a recessively inherited disorder that results from mutations in PLOD1, which encodes lysyl hydroxylase 1, an enzyme that modifies a subset of lysyl residues in the Y position in the triple helical domain of type I procollagen.  Bi-allelic mutations result in “undermodification” of the chains of type I procollagen and, as a result, a shift in electrophoretic mobility in the opposite direction from that of the “overmodified” chains.  This shift is variable and cannot be relied upon to confirm the diagnosis.  Analysis of complex cross-links in urine is the cheapest and quickest path to confirmation of the diagnosis, but detection of the mutations in both alleles is necessary if prenatal diagnosis is an objective.

Additional forms of recessively inherited EDS

In recent years several rare, recessively inherited forms of EDS have been identified.  For all of them, the diagnosis depends on careful clinical delineation and analysis of the candidate genes.