Eric,
First--I devoured your book and loved it. In my (recent) training as a sociologist, the heritability/twins/GWAS work that I encountered (only peripherally) never quite made sense to me. "Stuck in my craw". I think I just didn't have a good mental model to understand them. Reading your book, I finally feel some lights coming on. So thank you.
Second, on this post and this paper: fantastic, it gives me a more concrete grasp on many points in the book, and what might be happening between GWAS and twin studies.
One point that you don't explicitly make, but I wonder about. The possibly massive interactivity of the whole system--among genes and between genes and life events--presents a dimensionality problem, a la the "curse of dimensionality". It may be that even if you could get every living human into a GWAS (or even, every human who has ever lived!), you would not have enough data to develop a good predictive model of many complex behavioral outcomes. And the problem is even worse than that, because as you add more people, I imagine the way outcomes themselves are defined will start to drift and blur. Ah, but maybe I'm just restating your "gloomy prospect"?
Anyway, thanks again for your excellent work.
Noah
Thank you so much for the nice words about the book. Have you considered saying them more publicly? (I am not sure how many eyes cross the comments section of my Substack.) Perhaps on Bluesky or Amazon? It would be much appreciated if you are comfortable doing it.
Anyway, I agree 100% with what you say. I think the dimensionality problem is one way to describe the gloomy prospect. It is also a good way to understand the relationship between human science and fascism. If a scientist is frustrated with the difficulty of conducting human science, what can he do? He can "reduce the dimensionality of the space" by compromising human freedom.
Thanks again.
I don’t have much social media presence at the moment, but happy to leave something on Amazon. Also will recommend it to folks I know in a position to teach with it.
And aha! I see, a nice euphemism with many possible applications. “Dad, quit reducing the dimensionality of my space!”
Appreciate the shout out, Eric. It seems that there are a multitude of ways I could respond to this, but I’m not completely clear on your claim. It appears the Quincunx didn’t work (for the reasons I predicted when you brought it up on Twitter, as I recall). So as I understand it, you are saying it can’t predict phenotype from a heritability standpoint except if we have identical twins (or identical Quincunxes). I don’t know where your next installment is going with this, but I will try to preempt it by noting that the whole point of behavioral genetics was to examine high heritability of psychiatric diagnoses, personality traits and IQ. Yet, you are not successful in even developing a coherent mathematical model to explain this, even with some stochastic elements thrown in. In other words, you are negating the primary purpose of your field or at least not confirming it. So if the Quincunx model only works for identical twins, it seems like a failed model. If I say that non-pathological genetic variants are not causal for psychiatric disorders, your Quincunx experiment doesn’t contradict that unless the Quincunx is identical. So again, we are back to identical twins being all that you got. Moreover, that’s only theoretical. The fact that identical twins are more alike and the results of identical Quincunxes are more consistent doesn’t prove that the Quincunx model is accurate and, in fact, is contradicted by the non-identical Quincunx failure. Thus, my assertion that behavioral genetics is largely a null field remains on the table, 30 years running.
Hi Steve, as always my strongest objection is your insistence on addressing "behavior genetics" as some kind of unified entity, moreover one that I (of all people) am expected to justify and explain. It isn't and I can't.
You have been a practicing shrink for a long while. Does "family history", in the clinical sense of first-degree relatives, never play a role in your diagnostic thinking? You don't notice that certain psychiatric problems occur again and again in some families? I sure do. Before you say it: I know there is no way in any particular family to figure out how much of the family history is "genetic" vs "environmental", but the point of the quincunx experiment is that it is reasonable to expect genetic similarity to play a role, even for a complex phenotype that is just an emergent property of the whole organism.
I am just trying to explain the facts on the ground. Family members are similar, rMZ > rDZ, familial quantitative genetics finds substantial heritability, causal GWAS fails, GWAS heritability is non-zero but substantially smaller.
Here’s a thought experiment for you, Eric, based on years of clinical experience: we are told as psychiatrists that if a close family member is diagnosed with a particular disorder, we should be leaning in that direction when diagnosing a patient, so much so that if we were not to pick that diagnosis during our board practical exam, based on family history, we would likely fail. What do you think of that fact when you say diagnoses run in families? Can you see a self-perpetuating situation? And you are correct, I have decades of clinical experience with the most severe patients you will ever see. In my experience, it is quite uncommon to have two siblings both diagnosed with schizophrenia or even a parent/offspring combination. However, if you are familiar with the practical realities of clinical work, you will also discover how wobbly a diagnosis like “bipolar disorder” really is. I would say that maybe 10 or 20 percent of the patients given that diagnosis actually have true, classic, manic symptoms. The rest are given the diagnosis based on “mood swings,” and are more realistically cases of borderline personality disorder or substance abuse issues. A lot of that has to do with childhood physical and sexual abuse. So when, for example, a woman tells me that her sister is bipolar and she also has mood swings and harms herself, my first thought is that both of them were sexually abused, generally by their father, rather than genetic commonalities. What does a psychiatrist do when confronted with that? Most of them diagnose the sister with “bipolar disorder” and start her on various medications. You might be interested to know that the bulk of the patients participating in studies of bipolar disorder are in this category. These are the things you see when you work clinically.
So when I see a study claiming genetic correlations for “bipolar disorder,” I already know it is a flawed study before even looking at it, because the vast majority of the subjects don’t really even have classic bipolar disorder. It would benefit your field significantly to work with jaded clinicians once in a while instead of white-gloved academics who avoided clinical work their entire careers. Moreover, “behavioral genetics” is problematic even as a term, as it tries to do an end run around the mind-body problem. That is my issue with it. That encompasses most of the “phenotypes” you study.
The quincunx/Galton board design completely neglects how environment additionally shapes traits. It merely serves as a simplified model for how a genetic/breeding value lands where it does, and reinforces (imo) that there isn't missing heritability; it was never there in the first place!
We talk about the "environment" part at the end of the chapter. I agree that calling these pins "genetic" is arbitrary. In fact they could be environmental events just as easily. As I have said many times (e.g., toward the end of the Spit for Science paper), my point is not that complex phenotypes are genetic or environmental; it is that complex phenotypes can't be broken down into individual causes, especially additive ones. And I'm curious, is your last sentence about heritability never being there in the first place what you think, or is it the point of view you think I am falsely endorsing? Respect either way, but I note that you are a geneticist, and that would be a strong position for someone with that job title to take.
Thanks for the clarification. Upon reading the paper I do see that you covered the addition of environmental noise to the simulations, so that does assuage some concerns.
Regarding my comment, I don't believe that complex traits lack heritability; rather, I object to the framing of heritability as "missing" when comparing molecular genetics studies with twin studies. My prior leans strongly towards overestimates in twin studies, hence the gap is not going to be explained, hence "never there in the first place". I do concede that interactions may have important effects that aren't consistently modeled in molecular genetics studies, but again, I feel that they are unlikely to close the gap from twin estimates, due to the same issue as before.
Very interesting post, thanks! A couple of remarks though:
1/ Your hypothesis is testable as of today: get your hands on a GWAS dataset and run, say, a partial quadratic model (to keep things easy), a pathway-specific higher dimensionality model (to leverage biological insights) or a deep learning model (if there's enough data).
=> if your hypothesis is true, any of these approaches should yield significant uplifts vs linear models. (I wanted to do that years ago, but had no access to the data).
2/ Note however that if expression isn't linear, then this also affects twin studies. Linearity is one of their core assumptions and, without it, their heritability estimates ought to be revised downward. Hence the missing heritability would be much smaller than conventionally admitted.
3/ Note also that high-dimensionality non-linear systems are highly stochastic: randomness plays a HUGE role in the outcome, which would put a ceiling on heritability.
=> this is your third law in full magnitude: we'd be in gloomy prospect territory.
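A minimal sketch of the comparison proposed in 1/, run on simulated data rather than a real GWAS dataset. Everything here is an assumption made for illustration: ten hypothetical unlinked SNPs, a phenotype built purely from pairwise products, noise variance set equal to signal variance, and plain least squares via NumPy. It only shows what an "uplift vs linear models" would look like if the architecture really were pairwise.

```python
import numpy as np
from itertools import combinations

def pairwise_features(G):
    """Return main-effect columns followed by products of every SNP pair."""
    pairs = [G[:, i] * G[:, j] for i, j in combinations(range(G.shape[1]), 2)]
    return np.column_stack([G] + pairs)

def fit_and_score(X_train, y_train, X_test, y_test):
    """Fit least squares with an intercept; return out-of-sample R^2."""
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    resid = y_test - A_test @ beta
    return 1.0 - np.mean(resid ** 2) / np.var(y_test)

rng = np.random.default_rng(0)
n, m = 4000, 10  # hypothetical sample size and SNP count

# Genotypes coded 0/1/2 at allele frequency 0.5, shifted to have mean zero.
G = rng.binomial(2, 0.5, size=(n, m)).astype(float) - 1.0

# Assumed architecture: the phenotype depends only on pairwise products.
w_pair = rng.normal(size=m * (m - 1) // 2)
X_quad = pairwise_features(G)
signal = X_quad[:, m:] @ w_pair
y = signal + rng.normal(scale=signal.std(), size=n)  # roughly half signal, half noise

half = n // 2
r2_linear = fit_and_score(G[:half], y[:half], G[half:], y[half:])
r2_quadratic = fit_and_score(X_quad[:half], y[:half], X_quad[half:], y[half:])
print(f"linear R^2: {r2_linear:.3f}  quadratic R^2: {r2_quadratic:.3f}")
```

With m SNPs the quadratic model has m(m-1)/2 pair terms, so the number of parameters grows far faster than any realistic sample size once m is in the hundreds of thousands. A toy with ten SNPs sidesteps that entirely.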
Notes:
1/ This is exactly what we are working on. Power to discriminate different non-linear models is very problematic, however.
2/ Yes, non-linearity affects twin studies, but in a bit of a strange way. MZ/DZ studies only have two points, so you can't detect non-linearity. Not sure about your second point: I think non-linearity inflates MZ correlations, which increases h^2.
3/ Agreed about the gloomy prospect. There is some good older work on the role of stochasticity in human development that I could look up if you are interested.
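For readers following point 2/, here is a small sketch of the two-point estimate usually called Falconer's formula, h^2 = 2(rMZ - rDZ). The correlations below are hypothetical numbers chosen only to show how the estimate behaves; they come from no study.

```python
def falconer_h2(r_mz: float, r_dz: float) -> float:
    """Falconer's estimate: twice the difference between MZ and DZ twin correlations."""
    return 2.0 * (r_mz - r_dz)

# Hypothetical correlations, for illustration only.
additive_like = falconer_h2(r_mz=0.6, r_dz=0.3)      # r_dz is exactly half of r_mz
nonadditive_like = falconer_h2(r_mz=0.6, r_dz=0.15)  # r_dz is well below half of r_mz
print(additive_like, nonadditive_like)
```

In the second case the estimate comes out larger than r_mz itself, which the additive model can only accommodate with a negative shared-environment term. With just two correlations there is no leftover information to tell which kind of non-additivity produced the pattern.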
Thank you Eric!
Following up on the second point as this concerns the mathematical core of your argument:
If expression is linear for trait A and nonlinear for trait B where both A and B have the same "real-life" true heritability, then:
- MZ will have the same concordance rate for A and B
- DZ will have a higher concordance for A than B
- in situation B, it's no longer valid to use Falconer's formula to estimate heritability from the empirically measured concordance rates. You need to revise the Falconer estimate *downward*.
To understand this intuitively, consider a purely quadratic expression mechanism, where phenotype is linear on the Cartesian square of the genome (that is, all pairs of genes) - DZ share 25% of this Cartesian square (assuming unlimited recombination).
Am I missing something here? Or am I getting it wrong? Because if not then it's a major aspect of the story: if the expression is substantially nonlinear for a trait, then twin studies substantially overestimate its heritability. This would resolve the Missing Heritability mystery, but the new heritability estimate won't land on the twin studies value: it'd be more like meeting halfway between twin studies and GWAS.
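A sketch of the algebra behind this, using the textbook covariance-between-relatives decomposition from quantitative genetics. The assumptions are mine, brought in for illustration and not stated in the thread: random mating, unlinked loci, no shared environment, and only additive variance V_A and additive-by-additive (pairwise) variance V_AA, with V_P the total phenotypic variance.

```latex
\begin{align*}
r_{MZ} &= \frac{V_A + V_{AA}}{V_P}, &
r_{DZ} &= \frac{\tfrac{1}{2}V_A + \tfrac{1}{4}V_{AA}}{V_P} \\
\hat{h}^2_{\text{Falconer}} &= 2\,(r_{MZ} - r_{DZ}) = \frac{V_A + \tfrac{3}{2}V_{AA}}{V_P}
\end{align*}
```

In the purely additive case (V_AA = 0) the estimate equals V_A / V_P, as intended. In the purely quadratic case (V_A = 0) DZ twins share one quarter of the pairwise variance, and the estimate is 1.5 V_AA / V_P, which is 50% above the true genetic share V_AA / V_P under these assumptions.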