PZ Myers - a biologist and associate professor at the University of Minnesota, Morris - has a thing or three to say about Ray Kurzweil's claim that we'll reverse engineer the brain by 2020. Personal attacks aside, he makes a strong point.
There he goes again, making up nonsense and making ridiculous claims that have no relationship to reality. Ray Kurzweil must be able to spin out a good line of bafflegab, because he seems to have the tech media convinced that he's a genius, when he's actually just another Deepak Chopra for the computer science cognoscenti.
His latest claim is that we'll be able to reverse engineer the human brain within a decade. By reverse engineer, he means that we'll be able to write software that simulates all the functions of the human brain. He's not just speculating optimistically, though: He's building his case on such awfully bad logic that I'm surprised anyone still pays attention to that kook.
Sejnowski says he agrees with Kurzweil's assessment that about a million lines of code may be enough to simulate the human brain.
Here's how that maths works, Kurzweil explains: The design of the brain is in the genome. The human genome has three billion base pairs or six billion bits, which is about 800 million bytes before compression, he says. Eliminating redundancies and applying lossless compression, that information can be compressed into about 50 million bytes, according to Kurzweil.
About half of that is the brain, which comes down to 25 million bytes, or a million lines of code.
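For readers who want to check the quoted arithmetic, here is a minimal sketch in Python. It only reproduces the numbers in the passage above; the 50 million byte compressed size and the "about half is the brain" fraction are Kurzweil's figures taken as given, and the variable names are made up for illustration. It says nothing about whether the premise behind the numbers holds.

```python
# Back-of-the-envelope check of the figures quoted above.
# All inputs come from the quoted passage; names are illustrative only.

BASE_PAIRS = 3_000_000_000        # "three billion base pairs"
BITS_PER_BASE_PAIR = 2            # four possible bases -> 2 bits each

raw_bits = BASE_PAIRS * BITS_PER_BASE_PAIR   # six billion bits
raw_bytes = raw_bits // 8                    # 750 million, quoted as "about 800 million"

COMPRESSED_BYTES = 50_000_000     # Kurzweil's claimed compressed size (taken as given)
BRAIN_FRACTION = 0.5              # "about half of that is the brain" (taken as given)
LINES_OF_CODE = 1_000_000         # "a million lines of code"

brain_bytes = int(COMPRESSED_BYTES * BRAIN_FRACTION)
compression_ratio = raw_bytes / COMPRESSED_BYTES
bytes_per_line = brain_bytes / LINES_OF_CODE

print(f"raw genome:        {raw_bits:,} bits = {raw_bytes:,} bytes")
print(f"implied ratio:     {compression_ratio:.0f}:1")
print(f"brain share:       {brain_bytes:,} bytes")
print(f"implied line size: {bytes_per_line:.0f} bytes per line of code")
```

The arithmetic itself roughly holds together (six billion bits is 750 million bytes, and 25 million bytes over a million lines implies about 25 bytes per line). The dispute is over the first sentence of the quote, not the division.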
I'm very disappointed in Terence Sejnowski for going along with that nonsense.
See that sentence I bolded up there, "The design of the brain is in the genome"? That's his fundamental premise, and it is utterly false. Kurzweil knows nothing about how the brain works. Its design is not encoded in the genome: What's in the genome is a collection of molecular tools wrapped up in bits of conditional logic, the regulatory part of the genome, that makes cells responsive to interactions with a complex environment. The brain unfolds during development, by means of essential cell:cell interactions, of which we understand only a tiny fraction. The end result is a brain that is much, much more than simply the sum of the nucleotides that encode a few thousand proteins. He has to simulate all of development from his codebase in order to generate a brain simulator, and he isn't even aware of the magnitude of that problem.
We cannot derive the brain from the protein sequences underlying it; the sequences are insufficient as well, because the nature of their expression is dependent on the environment and the history of a few hundred billion cells, each plugging along interdependently. We haven't even solved the sequence-to-protein-folding problem, which is an essential first step to executing Kurzweil's clueless algorithm. And we have absolutely no way to calculate in principle all the possible interactions and functions of a single protein with the tens of thousands of other proteins in the cell!
Let me give you a few specific examples of just how wrong Kurzweil's calculations are. Here are a few proteins that I plucked at random from the NIH database; all play a role in the human brain.
First up is RHEB (Ras Homolog Enriched in Brain). It's a small protein, only 184 amino acids, which Kurzweil pretends can be reduced to about 12 bytes of code in his simulation. Here's the short description.
MTOR (FRAP1; 601231) integrates protein translation with cellular nutrient status and growth signals through its participation in 2 biochemically and functionally distinct protein complexes, MTORC1 and MTORC2. MTORC1 is sensitive to rapamycin and signals downstream to activate protein translation, whereas MTORC2 is resistant to rapamycin and signals upstream to activate AKT (see 164730). The GTPase RHEB is a proximal activator of MTORC1 and translation initiation. It has the opposite effect on MTORC2, producing inhibition of the upstream AKT pathway (Mavrakis et al., 2008).
Got that? You can't understand RHEB until you understand how it interacts with three other proteins, and how it fits into a complex regulatory pathway. Is that trivially deducible from the structure of the protein? No. It had to be worked out operationally, by doing experiments to modulate one protein and measure what happened to others. If you read deeper into the description, you discover that the overall effect of RHEB is to modulate cell proliferation in a tightly controlled quantitative way. You aren't going to be able to simulate a whole brain until you know precisely and in complete detail exactly how this one protein works.
And it's not just the one. It's all of the proteins. Here's another: FABP7 (Fatty Acid Binding Protein 7). This one is only 132 amino acids long, so Kurzweil would compress it to 8 bytes. What does it do?
Anthony et al. (2005) identified a Cbf1 (147183)-binding site in the promoter of the mouse Blbp gene. They found that this binding site was essential for all Blbp transcription in radial glial cells during central nervous system (CNS) development. Blbp expression was also significantly reduced in the forebrains of mice lacking the Notch1 (190198) and Notch3 (600276) receptors. Anthony et al. (2005) concluded that Blbp is a CNS-specific Notch target gene and suggested that Blbp mediates some aspects of Notch signaling in radial glial cells during development.
Again, what we know of its function is experimentally determined, not calculated from the sequence. It would be wonderful to be able to take a sequence, plug it into a computer and have it spit back a quantitative assessment of all of its interactions with other proteins, but we can't do that, and even if we could, it wouldn't answer all the questions we'd have about its function, because we'd also need to know the state of all of the proteins in the cell, and the state of all of the proteins in adjacent cells, and the state of global and local signalling proteins in the environment. It's an insanely complicated situation, and Kurzweil thinks he can reduce it to a triviality.
To simplify it so a computer science guy can get it, Kurzweil has everything completely wrong. The genome is not the program; it's the data. The program is the ontogeny of the organism, which is an emergent property of interactions between the regulatory components of the genome and the environment, which uses that data to build species-specific properties of the organism. He doesn't even comprehend the nature of the problem, and here he is pontificating on magic solutions completely free of facts and reason.
I'll make a prediction, too. We will not be able to plug a single unknown protein sequence into a computer and have it derive a complete description of all of its functions by 2020. Conceivably, we could replace this step with a complete, experimentally derived quantitative summary of all of the functions and interactions of every protein involved in brain development and function, but I guarantee you that won't happen either. And that's just the first step in building a simulation of the human brain derived from genomic data. It gets harder from there.
I'll make one more prediction. The media will not end their infatuation with this pseudo-scientific dingbat, Kurzweil, no matter how uninformed and ridiculous his claims get.