The mouse genome is only 160 megabytes, and contains the instructions for building the brain as well as building everything else, so the "secret sauce" of how to make an intelligent brain should not be extremely large, once you figure out how to do it. :) A lot of the actual connections must be either random, or encoding things the mouse learnt while growing up.
There are four bases, so one base encodes two bits of information. Eight bits are one byte, so four bases are one byte. 2500 megabases = 625 megabytes. So yeah, Parent was off by a factor of about 4 :) . But still, that fits on one CD.
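The arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the function name `genome_megabytes` is made up here, and it assumes a plain 2-bits-per-base encoding with decimal megabytes and no compression or metadata.

```python
BITS_PER_BASE = 2  # four bases (A, C, G, T), so log2(4) = 2 bits each


def genome_megabytes(megabases: float) -> float:
    """Raw size in (decimal) megabytes of a sequence of `megabases` million bases.

    Assumption: plain 2-bit packing, nothing else stored alongside the sequence.
    """
    bits = megabases * 1_000_000 * BITS_PER_BASE
    return bits / 8 / 1_000_000  # 8 bits per byte, 10^6 bytes per megabyte


print(genome_megabytes(2500))  # 625.0
```

The same function gives 25.0 for a 100-megabase genome, since four bases always pack into one byte.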
Except that currently genomics requires even more information to be encoded - such as quality scores, allele frequencies, phase information, ... - so, depending on the format, this estimate is off by either one or two orders of magnitude still.
Take a bunch of source code. Compile it, obfuscate it, compress it, and encrypt it with AES such that the result is a 160MB blob. Now see how long it takes to figure out what it does, given a computer that costs a lot of money just to load your program and takes a long time to give you a result. The upper bound on the complexity of DNA as it relates to the complete expression of an organism phenotype is insanely high.
Most developers think of the genome like a big load of source code, and if only we could work out where the if and for statements were we could read it. This is an extremely naive and overconfident point of view; the analogy between source code and genomes is very poor. The genome is coding for proteins (by way of RNA). Those proteins are subject to all of physics (think: electrostatics, hydrophobicity, ...), whereas your code is an abstract entity designed to run on a rather simple analogue of a Turing machine. Life is much more complex, I am afraid, though that never seems to stop developers from assuming that they can create a crude analogy which explains it. Also, the size is totally wrong; see previous comment.
It's true that genes and proteins are nothing like code, but in the context of understanding the brain, I think that should be cause for optimism, because it means that nature has its hands tied behind its back. The genes can't just contain a description of how the brain should be wired together, because the description also has to be "self-executing"; the entire object must robustly self-assemble just from proteins physically interacting. So although 700 megabytes of mouse genes could potentially contain a lot of stuff, it might be possible to do the same thing much more simply if we can program a digital computer instead.
Like, the connectome for C. elegans has been mapped out; it can be written down as a 2 megabyte ASCII text file. Just the connectivity is not enough to actually reproduce the behavior of the worm; you would also need data about the weight of each connection, but it's still a lot less data than the worm genome (about 25 megabytes; I hope I got the number right this time!). The worm genes also need to contain a lot of additional stuff to build functioning cell internals, etc., stuff which hopefully is irrelevant to the actual cognition.
> whereas your code is an abstract entity designed to run on a rather simple analogue of a Turing machine.
I cannot adequately put the insane laugh required as a response to that into text form. So I will only write this and be just as right: going by physics, the brain of a mouse can be adequately approximated by a perfect sphere.
The definition of a Turing machine is mathematically perfect. No threading, no IO, no error correction, no errors, no asynchronous events, no processes fighting over shared resources, no resources that might or might not disappear in the blink of an eye, in short no nothing. In that it is equivalent to a spherical brain, with any complexity relevant to the problem at hand removed.
You're making things far too complex, and confusing the issue, and yourself, as a result. Let's turn to the first sentence from Wikipedia:
"A Turing machine ... manipulates symbols on a strip of tape according to a table of rules"
Whichever programming language you are fond of ultimately reduces to this mode of computation. However, with DNA, RNA, and Proteins, that is not the case. The way that we compute is simplistic compared with the way that biology computes. Thus: the crude analogy in fact hinders understanding, and should be discarded.
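To make the quoted definition concrete, here is a minimal sketch of "symbols on a tape plus a table of rules" in Python. Everything in it is made up for illustration: the function name `run_turing_machine`, the `INVERT` rule table, the `_` blank symbol, and the step limit are all assumptions, not anything from the Wikipedia article.

```python
def run_turing_machine(rules, tape, state="start", halt="halt",
                       blank="_", max_steps=1000):
    """Run a one-tape machine and return the final tape contents.

    `rules` maps (state, symbol) -> (symbol_to_write, "L" or "R", next_state).
    The run stops at the halt state, or after `max_steps` as a safety limit.
    """
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == halt:
            break
        symbol = cells.get(head, blank)           # read
        write, move, state = rules[(state, symbol)]
        cells[head] = write                       # write
        head += 1 if move == "R" else -1          # move the head
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Hypothetical rule table: flip every bit, halt at the first blank.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

print(run_turing_machine(INVERT, "1011"))  # 0100
```

The whole "machine" is one lookup table and a loop that reads, writes, and moves, which is the sense in which this mode of computation is simple.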
The "learning" can come from things like the compounds in various foods, and so on! "Learning" is, in a generalized sense, any non-genetically-bootstrapped transfer of information from the environment to the body...