Recent breakthroughs in genetic research may have uncovered new genes underlying common psychiatric disorders. Schizophrenia and bipolar disorder affect more than 64 million people around the world. These conditions are strongly influenced by genetics. However, there is no single gene that determines a person’s risk of developing schizophrenia or bipolar disorder. Rather, it is likely that a large number of genes contribute to the risk. Using artificial intelligence, researchers at Stanford University have now discovered complex variants in the human genome that may contribute to these psychiatric disorders. This new study suggests that mutations that occur after conception, which give rise to genetic mosaicism, may be responsible for a number of psychiatric disorders, including bipolar disorder and schizophrenia.
Think of a genome as a living book with instructions for every cell in the body. Our genes are the chapters. We have about 20,000 genes that provide instructions for making proteins, the building blocks of life. However, the vast majority of our genome is non-coding, meaning it does not provide instructions for proteins. Nevertheless, these non-coding regions play an important role in regulating gene activity and cell function.
Genetic variants or spelling changes in a coding or non-coding region can interfere with the way the cell translates specific instructions. A small typo may have little to no impact on how the book is read. However, larger spelling changes may result in the deletion of a sentence or even an entire chapter. Without the proper instructions to produce specific proteins, these spelling changes can contribute to conditions that affect various aspects of our bodies.
Our genes are a combination of the DNA we inherit from our parents. We have two copies of each gene, one from our mother and one from our father. These randomly assorted gene pairs determine traits such as hair texture, eye color and even some health risks. Some traits are dominant, meaning that only one copy of the variant is needed for expression. Others are recessive and only appear when both copies are the same. This is called Mendelian inheritance, named after Gregor Mendel, who worked out how traits are passed on by studying pea plants.
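The dominant and recessive pattern described above can be illustrated with a small Punnett-square sketch. This is a toy model that assumes a single gene with two versions, written here as the hypothetical labels `A` (dominant) and `a` (recessive); most real traits involve many genes.

```python
from collections import Counter
from itertools import product


def punnett(parent1: str, parent2: str) -> Counter:
    """Count the genotypes a child could inherit from two parents.

    Each parent is written as two letters, one per gene copy, e.g. "Aa".
    Sorting the pair puts the uppercase letter first, so "aA" and "Aa"
    are counted as the same genotype.
    """
    return Counter("".join(sorted(a + b)) for a, b in product(parent1, parent2))


def dominant_fraction(genotypes: Counter, dominant: str = "A") -> float:
    """Fraction of outcomes carrying at least one dominant copy."""
    total = sum(genotypes.values())
    shown = sum(n for genotype, n in genotypes.items() if dominant in genotype)
    return shown / total


# Two parents who each carry one dominant and one recessive copy.
cross = punnett("Aa", "Aa")
print(dict(cross))
print(dominant_fraction(cross))
```

In this cross, three of the four equally likely combinations carry at least one dominant copy, so the recessive trait would be expected in about one child in four.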
In the earliest stages of life, DNA undergoes multiple rounds of replication. Trillions of cell divisions take place, with one cell splitting into two identical daughter cells. However, DNA replication is prone to errors. Every time a cell divides, small spelling mistakes are created in the genome. Rapid replication during the first trimester of pregnancy can therefore introduce a large number of genetic changes that are not present in either parent. This is known as genetic mosaicism, where two or more genetically different cell populations are present in the body. Mosaicism can appear as two different eye colors or as alternating skin patterns. A number of conditions have also been linked to mosaicism, such as developmental delays, autism, epilepsy and some cancers. We all have some degree of genetic mosaicism in our bodies. This is why identical twins can have different fingerprints.
Genetic variants can also be acquired throughout an individual’s lifespan, further changing the mosaic of our genome. Changes in DNA can result from exposure to chemicals or radiation, or from infections such as hepatitis B and C that affect the genetic material in a host cell. Other variants arise randomly: DNA can develop errors during replication and other normal cell functions. This damage is exacerbated by inflammation, aging, and lifestyle choices such as smoking and poor diet. Finding out which variants contribute to certain conditions can therefore be a very complex process.
Whole genome sequencing (WGS) can help identify small changes in DNA. This genetic test maps an individual’s entire genome using samples collected from blood or swabs. Whole genome sequencing extracts the exact sequences that comprise each chapter of our DNA. The extracted sequences are then compared to reference genes from a typical human genome. Any difference between an individual’s genome and the reference genome reveals a possible variant that could be associated with a condition.
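The comparison step can be sketched in a few lines. This is a toy illustration only: it assumes two short, made-up sequences that are already lined up and equal in length, whereas real pipelines rely on dedicated alignment and variant-calling software. The function name `find_substitutions` is hypothetical.

```python
def find_substitutions(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_letter, sample_letter) for each mismatch.

    Toy assumption: both sequences are pre-aligned and the same length,
    so only single-letter substitutions can be detected here.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length, pre-aligned sequences")
    return [
        (position, ref_letter, sample_letter)
        for position, (ref_letter, sample_letter) in enumerate(zip(reference, sample))
        if ref_letter != sample_letter
    ]


# Made-up example sequences, not real genomic data.
reference = "ACGTACGTAC"
sample = "ACGTTCGTAA"
print(find_substitutions(reference, sample))  # [(4, 'A', 'T'), (9, 'C', 'A')]
```

A letter-by-letter comparison like this only catches the simplest kind of typo. Deleted, duplicated or rearranged stretches shift the text out of alignment and need more sophisticated methods to detect.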
Alexander Urban, senior author of this study and associate professor at Stanford, explains: “Looking for only simple variations is like proofreading a book manuscript and looking only for typos that change individual letters. You overlook words that are distorted, duplicated, or in the wrong order; you may even miss that half a chapter is gone.” Certain disorders may actually be linked to large, complex spelling changes in an individual’s genes. The picture is further complicated by the fact that variants in different genes can be associated with more than one condition.
Many psychiatric disorders are affected by multiple changes in similar genes. Bipolar disorder and schizophrenia are prime examples of the complexity of the human genome. Hundreds of genetic variants have been identified that contribute to the risk. Many of these genes are linked to brain development, immune system regulation, and neuron signaling pathways. The AKAP11 gene in particular appears to be a strong risk factor for bipolar disorder, although recent studies in mice suggest that this gene may also be involved in schizophrenia. Understanding how spelling changes in this gene interact with other high-risk variants could help decipher what triggers the onset of psychiatric symptoms.
In their study, Zhou et al. compared the genomes of more than 4,000 individuals from around the world. Their complete DNA sequences were obtained using whole genome sequencing. The data were then fed into an AI algorithm trained on dozens of genomes from different ancestries. This approach allowed the researchers to match large, complex gene variants with specific health problems.
The study specifically recruited individuals with known diagnoses of bipolar disorder or schizophrenia and compared them to healthy controls. This type of approach is known as a genome-wide association study (GWAS). Genome-wide association studies compare the genes of individuals with a particular disease with those of a large cohort of controls. While this approach can tell us where variants are located, it often reveals little about what kind of change is involved. For example, it can tell us that the book contains spelling changes on pages 122, 296, and 731, but not what types of errors they are. The AI algorithm developed by Zhou et al., however, adds more specificity. It highlights the changed word or phrase and reports whether it has been scrambled, duplicated or deleted.
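The case-versus-control comparison at the heart of a genome-wide association study can be sketched for a single variant as follows. This is a simplified illustration rather than the method used by Zhou et al.: it applies a basic chi-square test to a two-by-two table of counts, the function name `allele_association` is hypothetical, and the numbers are invented for the example. Real studies repeat a test like this across the whole genome and apply much stricter statistical corrections.

```python
import math


def allele_association(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int) -> tuple[float, float]:
    """Chi-square test (1 degree of freedom) on a 2x2 table of counts.

    Rows are cases and controls; columns are copies carrying the variant
    ("alt") and copies matching the reference ("ref").
    Returns (chi_square_statistic, p_value).
    """
    a, b, c, d = case_alt, case_ref, control_alt, control_ref
    total = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one count")
    chi_square = total * (a * d - b * c) ** 2 / denominator
    # For one degree of freedom, the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return chi_square, p_value


# Invented counts: the variant appears in 60 of 200 gene copies from cases
# and in 40 of 200 gene copies from controls.
chi_square, p_value = allele_association(60, 140, 40, 160)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the variant is more common in one group than chance alone would explain. It says nothing about what kind of spelling change the variant is, which is the gap the AI tool is meant to fill.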
With an accuracy of more than 85%, the AI tool identified more than 8,000 complex variants. Many of these spelling changes were found in parts of the genome that provide instructions for brain function. To determine whether these variants could be linked to psychiatric disorders, the researchers analyzed DNA from brain tissue samples of individuals with schizophrenia or bipolar disorder. The complex variants they identified appeared to overlap with single variants found in other genome-wide association studies of these conditions. For example, one complex variant they found linked to schizophrenia and bipolar disorder was 4,700 base pairs long. Base pairs are the basic units of DNA; in the book analogy, they resemble the words in the book.
New innovations in genetic research deepen our understanding of the human genome. By analyzing vast amounts of genetic data, AI technology is uncovering complex relationships between major variants and certain psychiatric disorders. This not only increases our understanding of the genetic basis of these conditions, but also paves the way for personalized medicine. As we uncover more of the human genome, future studies could reveal deeper insights into the genetic underpinnings of a range of conditions.