From: M. Taylor Saotome-Westlake
Date: Thu, 26 Mar 2020 06:15:56 +0000 (-0700)
Subject: Human Diversity: the meaning of test bias
X-Git-Url: http://232903.hjopswx29.asia/source?a=commitdiff_plain;h=691f1fb3ff12fd01d27603902e8d05a000bc327c;p=Ultimately_Untrue_Thought.git

Human Diversity: the meaning of test bias
---

diff --git a/content/drafts/book-review-human-diversity.md b/content/drafts/book-review-human-diversity.md
index 9f65825..d72b0ed 100644
--- a/content/drafts/book-review-human-diversity.md
+++ b/content/drafts/book-review-human-diversity.md
@@ -54,7 +54,7 @@ Then we can estimate the sizes of the _A_, _C_, and _E_ components by studying f
 Anyway, it turns out that the effect of the shared environment _C_ is way smaller than most people intuitively expect—next to zero for personality and adult intelligence. The environment matters—just not the part of the environment shared by siblings in the same family. Just not the part of the environment we know how to control. Thus, a lot of economic and class stratification actually ends up being along genetic lines: the nepotism of family wealth can buy opportunities and second chances, but it doesn't actually live your life for you.
-It's important not to overinterpret the heritability results; there are a bunch of standard caveats that go here that everyone's treatment of the topic needs to include! Heritability is about the _variance_ in phenotypes that can be predicted by _variance_ in genes. This is _not_ the same concept as "controlled by genes." To see this, notice that the trait "number of heads" has a heritability of zero because the variance is zero: all living people have exactly one head. (Uh, I'm counting Siamese twins as two people.) Heritability estimates are also necessarily bound to a particular population in a particular place and time, which can face constraints shaped solely by the environment.
If you plant half of a batch of seeds in the shade and half in the sun, the variance in the heights of the resulting plants will be associated with variance in genes _within_ each group, but the difference _between_ the groups is solely determined by the sunniness of their environments. Likewise, in a Society with a cruel caste system under which children with red hair are denied internet access, part of the heritability of intellectual achievement is going to come from alleles that code for red hair. Even though (_ex hypothesi_) redheads have the same inherent intellectual potential as everyone else, the heritability computation can't see into worlds that are not our own, which might have vastly different gene–environment correlations. +It's important not to overinterpret the heritability results; there are a bunch of standard caveats that go here that everyone's treatment of the topic needs to include! Heritability is about the _variance_ in phenotypes that can be predicted by _variance_ in genes. This is _not_ the same concept as "controlled by genes." To see this, notice that the trait "number of heads" has a heritability of zero because the variance is zero: all living people have exactly one head. (Siamese twins are two people.) Heritability estimates are also necessarily bound to a particular population in a particular place and time, which can face constraints shaped solely by the environment. If you plant half of a batch of seeds in the shade and half in the sun, the variance in the heights of the resulting plants will be associated with variance in genes _within_ each group, but the difference _between_ the groups is solely determined by the sunniness of their environments. Likewise, in a Society with a cruel caste system under which children with red hair are denied internet access, part of the heritability of intellectual achievement is going to come from alleles that code for red hair. 
Even though (_ex hypothesi_) redheads have the same inherent intellectual potential as everyone else, the heritability computation can't see into worlds that are not our own, which might have vastly different gene–environment correlations. Old-timey geneticists used to think that they would find a small number of "genes for" something, but it turns out that we live in an omnigenetic, pleiotropic world where lots and lots of SNPs each exert a tiny effect on potentially lots and lots of things. I feel like this probably _shouldn't_ have been surprising (genes code for proteins, variation in what proteins get made is going to affect high-level behaviors, but high-level behaviors involve _lots_ of proteins in a super-complicated unpredictable way), but I guess it was. @@ -64,7 +64,9 @@ The starry-eyed view epitomized by Plomin says that polygenic scores are _super The curmudgeonly view epitomized by Turkheimer says that science is about understanding the _causal structure_ of phenomena, and that polygenic scores don't fucking tell us anything. [Divorce is heritable _in the same way_ that intelligence is heritable](http://www.geneticshumanagency.org/gha/the-ubiquity-problem-for-group-differences-in-behavior/), not because there are "divorce genes" in any meaningful biological sense, but because of a "universal, nonspecific genetic pull on everything." -Notably, Plomin and Turkheimer aren't actually disagreeing here: it's a difference in emphasis rather than facts. Polygenic scores _don't_ explain mechanisms—but might they end up being useful, and used, anyway? Murray's vision of social science is content to make predictions and "explain variance" while remaining ignorant of ultimate causality. 
Meanwhile, my cursory understanding (while kicking myself for [_still_](/2018/Dec/untitled-metablogging-26-december-2018/#daphne-koller-and-the-methods) not having put in the hours to get much farther into [_Probabilistic Graphical Models: Principles and Techniques_](https://mitpress.mit.edu/books/probabilistic-graphical-models)) was that you need to understand causality in order to predict what interventions will have what effects—maybe our feeble state of knowledge is _why_ we don't know how to find reliable large-effect environmental interventions that still yet might exist in the vastness of the space of possible interventions. +Notably, Plomin and Turkheimer aren't actually disagreeing here: it's a difference in emphasis rather than facts. Polygenic scores _don't_ explain mechanisms—but might they end up being useful, and used, anyway? Murray's vision of social science is content to make predictions and "explain variance" while remaining ignorant of ultimate causality. Meanwhile, my cursory understanding (while kicking myself for [_still_](/2018/Dec/untitled-metablogging-26-december-2018/#daphne-koller-and-the-methods) not having put in the hours to get much farther into [_Probabilistic Graphical Models: Principles and Techniques_](https://mitpress.mit.edu/books/probabilistic-graphical-models)) was that you need to understand causality in order to predict what interventions will have what effects [TODO: explain why with example] + +Maybe our feeble state of knowledge is _why_ we don't know how to find reliable large-effect environmental interventions that still yet might exist in the vastness of the space of possible interventions. There are also some appendices at the back of the book! Appendix 1 (reproduced from, um, one of Murray's earlier books with a coauthor) explains some basic statistics concepts. 
Appendix 2 ("Sexual Dimorphism in Humans") goes over the prevalence of intersex conditions and gays, and then—so much for this post broadening the [topic scope of this blog](/tag/two-type-taxonomy/)—transgender typology! Murray presents the Blanchard–Bailey–Lawrence–Littman view as fact, which I think is basically _correct_, but a more comprehensive treatment (which I concede may be too much to hope for from a mere Appendix) would have at least _mentioned_ alternative views ([Serano](https://rationalwiki.org/wiki/Intrinsic_Inclinations_Model)? [Veale](/papers/veale-lomax-clarke-identity_defense_model.pdf)?), if only to explain _why_ they're worth dismissing. (Contrast to the eight pages in the main text explaining why "But, but, epigenetics!" is worth dismissing.) Then Appendix 3 ("Sex Differences in Brain Volumes and Variance") has tables of brain-size data, and an explanation of the greater-male-variance hypothesis. Cool! @@ -124,8 +126,8 @@ In 1994's _The Bell Curve: Intelligence and Class Structure in American Life_, M So Murray and Herrnstein talk about this "intelligence" thingy, and how it's heritable, and how it predicts income, school success, not being a criminal, _&c._, and how this has all sorts of implications for Society and inequality and class structure and stuff. -This _should_ just be more social-science nerd stuff, the sort of thing that would only draw your attention if, like me, you feel bad about not being smart enough to do algebraic topology and want to console yourself by at least knowing about the Science of not being smart enough to do algebraic topology. The reason everyone _and her dog_ is still mad at Charles Murray a quarter century later is Chapter 13, "Ethnic Differences in Cognitive Ability", and Chapter 14, "Ethnic Inequalities in Relation to IQ". So, _apparently_, different ethnic/"racial" groups have different average scores on IQ tests. 
[Ashkenazi Jews do the best](https://slatestarcodex.com/2017/05/26/the-atomic-bomb-considered-as-hungarian-high-school-science-fair-project/). (I sometimes privately joke that the fact that I'm [only 85% Ashkenazi (according to 23andMe)](/images/ancestry_report.png) explains my low IQ.) East Asians do a little better than Europeans/"whites". And—this is the part that no one is happy about—the difference between U.S. whites and U.S. blacks is about Cohen's _d_ ≈ 1. (If two groups differ by _d_ = 1 on some measurement that's normally distributed within each group, that means that the mean of the group with the lower average measurement is at the 16th percentile of the group with the higher average measurement, or that a uniformly-randomly selected member of the group with the higher average measurement has a probability of about 0.76 of having a higher measurement than a uniformly-randomly selected member of the group with the lower average measurement.) +This _should_ just be more social-science nerd stuff, the sort of thing that would only draw your attention if, like me, you feel bad about not being smart enough to do algebraic topology and want to console yourself by at least knowing about the Science of not being smart enough to do algebraic topology. The reason everyone _and her dog_ is still mad at Charles Murray a quarter century later is Chapter 13, "Ethnic Differences in Cognitive Ability", and Chapter 14, "Ethnic Inequalities in Relation to IQ". So, _apparently_, different ethnic/"racial" groups have different average scores on IQ tests. [Ashkenazi Jews do the best](https://slatestarcodex.com/2017/05/26/the-atomic-bomb-considered-as-hungarian-high-school-science-fair-project/), which is why I sometimes privately joke that the fact that I'm [only 85% Ashkenazi (according to 23andMe)](/images/ancestry_report.png) explains my low IQ. (I'm pretty dumb compared to some of my robot-cult friends.) East Asians do a little better than Europeans/"whites". 
And—this is the part that no one is happy about—the difference between U.S. whites and U.S. blacks is about Cohen's _d_ ≈ 1. (If two groups differ by _d_ = 1 on some measurement that's normally distributed within each group, that means that the mean of the group with the lower average measurement is at the 16th percentile of the group with the higher average measurement, or that a uniformly-randomly selected member of the group with the higher average measurement has a probability of about 0.76 of having a higher measurement than a uniformly-randomly selected member of the group with the lower average measurement.) -It's important not to overinterpret the IQ-scores-by-race results; there are a bunch of standard caveats that go here that everyone's treatment of the topic needs to include. Again, just because variance in a trait is statistically associated with variance in genes _within_ a population, does _not_ mean that differences in that trait _between_ populations are _caused_ by genes: [remember the illustrations about](#heritability-caveats) sun-deprived plants and internet-deprived red-haired children. Group differences in observed tested IQs are entirely compatible with the blank slate doctrine, a world in which those differences are entirely due to the environment imposed by an overtly or structurally racist society. Maybe the tests are culturally biased. Maybe people with higher socioeconomic status get more opportunities to develop their intellect, and racism impedes socio-economic mobility. And so on. +It's important not to overinterpret the IQ-scores-by-race results; there are a bunch of standard caveats that go here that everyone's treatment of the topic needs to include. 
Again, just because variance in a trait is statistically associated with variance in genes _within_ a population, does _not_ mean that differences in that trait _between_ populations are _caused_ by genes: [remember the illustrations about](#heritability-caveats) sun-deprived plants and internet-deprived red-haired children. Group differences in observed tested IQs are entirely compatible with a world in which those differences are entirely due to the environment imposed by an overtly or structurally racist society. Maybe the tests are culturally biased. Maybe people with higher socioeconomic status get more opportunities to develop their intellect, and racism impedes socio-economic mobility. And so on. -The problem is, a lot of the blank-slate-compatible hypotheses for group IQ differences become less compelling when you look into the details. "Maybe the tests are biased", for example, isn't an insurmountable defeater to the entire endeavor of psychometrics—it is _itself_ a falsifiable hypothesis, or can become one if you specify what you mean by "bias" in detail. If a test question were biased against a group, you would expect \ No newline at end of file +The problem is, a lot of the blank-slate-compatible hypotheses for group IQ differences become less compelling when you look into the details. "Maybe the tests are biased", for example, isn't an insurmountable defeater to the entire endeavor of IQ testing—it is _itself_ a falsifiable hypothesis, or can become one if you specify what you mean by "bias" in detail. 
One idea of what it would mean for a test to be _biased_ is if it's partially measuring something other than what it purports to be measuring: if your test measures a combination of "intelligence" and "submission to the hegemonic cultural dictates of the test-maker", then individuals and groups that submit less to your cultural hegemony are going to score worse, and if you _market_ your test as unbiasedly measuring intelligence, then people who believe your marketing copy will be misled into thinking that those who don't submit are dumber than they really are.
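
A toy simulation can make this notion of bias concrete. The sketch below is only an illustration with made-up numbers, not anything taken from the book or from the psychometrics literature: it assumes a hypothetical test whose score is a latent `ability` term plus a `conformity` term scaled by `bias_weight`, and two hypothetical groups with the _same_ ability distribution whose mean conformity differs by one standard deviation. All of these names and parameter values are assumptions chosen for the example.

```python
import random
import statistics


def simulate_scores(n, conformity_mean, bias_weight, rng):
    """Return n test scores: latent ability plus a weighted conformity term."""
    scores = []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # identical distribution for every group
        conformity = rng.gauss(conformity_mean, 1.0)  # only this differs by group
        scores.append(ability + bias_weight * conformity)
    return scores


def cohens_d(a, b):
    """Standardized mean difference, pooling the two (equal-sized) samples."""
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000
observed = {}
for bias_weight in (0.0, 0.5):
    group_x = simulate_scores(n, 0.0, bias_weight, rng)   # hypothetical group X
    group_y = simulate_scores(n, -1.0, bias_weight, rng)  # hypothetical group Y
    observed[bias_weight] = cohens_d(group_x, group_y)
    print(f"bias_weight={bias_weight}: observed d = {observed[bias_weight]:.2f}")
```

Under these assumptions, a test with `bias_weight = 0` shows no gap beyond sampling noise, while `bias_weight = 0.5` produces an expected gap of 0.5 / sqrt(1 + 0.5^2), or about 0.45 standard deviations, even though the two groups have identical ability distributions by construction.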