Showing posts with the label "gene expression". Show all posts

Spooky action at a (short) distance

Entanglement in physics is about action that seems to transfer some sort of 'information' across distances at speeds faster than that of light.  Roughly speaking (I'm not a physicist!), it is about objects with states that are not fixed in advance, and could take various forms but must differ between them, and that are separated from each other.  When measurement is made on one of them, whatever the result, the corresponding object takes on its opposite state.  That means the states are not entirely due to local factors, and somehow the second object 'knows' what state the first was observed in and takes on a different state.

You can read about this in many places and understand it better than I do or than I've explained it here.  Albert Einstein was skeptical that this could occur, if the speed of light were the fastest possible speed.  So he famously called the findings as they stood at that time "Spooky action at a distance." But the findings have stood many specific tests, and seem to be real, however it happens.

Does life, too, have spooky action? 
I think the answer is: maybe so.  But it is at a very short distance, within the nuclei of individual cells.  Organisms have multiple chromosomes and many species, like humans, have 2 instances of each (are 'diploid'), one inherited from each parent.  I say 'instances' rather than 'copies', because they are not identical to each other nor to those of the parent that transmitted each of them.  They are perhaps near copies, but mutation always occurs, even among the cells within each of us, so each cell differs from its contemporary somatic fellows and from what we inherited in our single-cell beginnings as a fertilized egg.

Many clever studies over many years have been documenting the 3-dimensional, context-specific conformation, or detailed physical arrangement of chromosomes within cells.  The work is variously known, but one catch-term is chromosome conformation capture, or 3C, and I'll use that here.  Unless or until this approach is shown to be too laden with laboratory artifact (it's quite sophisticated), we'll assume it's more or less right.

The gist of the phenomenon is that (1) a given cell type, under a given set of conditions, is using only a subset of its genes (for my  purposes here this generally means protein-coding genes proper); (2) these active genes are scattered along and between the chromosomes, with intervening inactive regions (genes not being used at the moment); (3) the cell's gene expression pattern can change quickly when its circumstances change, as it responds to environmental conditions, during cell division, etc.; (4) at least to some extent the active regions seem to be clustered physically together in expression-centers in the nucleus; (5) this all implies that there is extensive trans communication, coordinating, and physically juxtaposing, parts within and among each chromosome--there is action at a very short distance.

Even more remarkably, I think, this phenomenon seems somehow robust to speciation because related species have similar functions and similar sets of genes, but often their chromosomes have been extensively rearranged during their evolutionary separation. More than this: each person has different DNA sequences due to mutation, and different numbers of genes due to copy number changes (duplications, deletions); yet the complex local juxtapositions seem to work anyway.  At present this is so complicated, so regular, and so changeable and thus so poorly understood, that I think we can reasonably parrot Einstein and call it 'spooky'.

What this means is that chromosomes are not just randomly floating around like a bowl of spaghetti.   Gene expression (including transcribed non-coding RNAs) is thought to be based on the sequence-specific binding of tens of transcription factors in an expression complex that is (usually) just upstream of the transcribed part.  Since a given cell under given conditions is expressing thousands of condition-specific genes, there must be very extensive interaction or 'communication' in trans, that is, across all the chromosomes. That's because the cell can change its expression set very quickly.

The 3C results show that in a given type of cell under given conditions, the chromosomes are physically very non-randomly arranged, with active centers physically very near or perhaps touching each other.  How this massive plate of apparent spaghetti physically rearranges to bring these areas together without getting totally tangled up, yet remains quickly rearrangeable, is, to me, spooky if anything in Nature is.  The entanglement, disentanglement, and re-entanglement happen genome-wide, which is essentially what the classical term 'polygenic' implicitly recognized about genetic causation, and which is now being documented.

The usual approach of genetics these days is to sequence and enumerate various short functional bits as being coding, regulatory, enhancing, inhibiting, transcribing etc. other parts nearby.  We have long been able to analyze cDNA and decide which parts are being used for protein coding, at least. Locally, we can see why or how this happens, in the sense that we can identify the transcription factors and their binding sites, called promoters, enhancers and the like, and the actual protein or functional RNA codes.  We can find expression correlates by extracting them from cells and enumerating them.  3C analysis appears to show that these coding elements are, at least to some extent, found juxtaposed in various transcription hot-spots.

Is gene expression 'entangled'?
What if the molecular aspects of the 3C research were shown to be technical artifacts, relative to what is really going on?  I have read some skepticism about that, concerning what is found in single cells vs aggregates of 'identical' cells.  If 3C stumbles, will our idea of polygenic condition-specific gene usage change?   I think not.  We needn't have 3C data to show the functional results since they are already there to see (e.g., in cell-specific expression studies--cDNA and what ENCODE has found). If 3C has been misleading for technical or other reasons, it would just mean that something else just as spooky but different from the 3D arrangement that 3C detects, is responsible for correlating the genomewide trans gene usage.  And it's of course 4-dimensional since it's time-dependent, too.  So what I've said here still will apply, even if for some other, unknown or even unsuspected reason.

The existing observations on context-specific gene expression show that something 'entangles' different parts of the genome for coordinated use, and that this can change very rapidly.  The same genome, among the different types of cells of an individual, can behave very differently in this sense. Somehow, its various chromosomal regions 'know' how to be, or, better put, are coordinated.  This seems at least plausibly to be more than just a context-specific set of transcription factors (TFs) binding selectively near regions to be transcribed and changing in its thousands of details almost instantly.  Which TFs?  And how does a given TF know which binding sites to grab or to release, moment by moment, since TFs typically bind enhancers or promoters of many different genes, not all of them expression-related?  And if you want to dismiss that, by saying for example that this has to do with which TFs are themselves being produced, or which parts of DNA are unwrapped at each particular time, then you're just bumping the same question about trans control up, or over, to a different level of what's involved.  That's no answer!

And there is even another, seemingly simpler example to show that we really don't understand what's going on: the alignment of homologues in the first stage of meiosis.  We've been taught that empirical and necessary fact about meiosis for many decades. But how do the two homologues find each other to align?  This is essentially just not mentioned in textbooks, if anyone was even asking.  I've seen some speculative ideas, again involving what I'll call 'electromagnetic' properties of each chromosome, but even their authors didn't really claim they were sufficient or definitive.  Just for example, homologous chromosomes in a diploid individual have different rearrangements, deletions, duplications, and all sorts of heterozygous sequence details, yet by and large they still seem to find each other in meiosis.  Something's going on!

How might this be tested?
I don't have any answers, but I wonder, on the hypothesis that these thoughts are on target, how we might set up some critical experiments to test this.  I don't know if we can push the analogy with tests for quantum entanglement or not, but probably not.

One might hope that 'all' we have to do is enumerate sequence bits to account for this action-at-a-distance, this very detailed trans phenomenon.  But I wonder......I wonder if there may be something entirely unanticipated or even unknown that could be responsible.  Maybe there are 'electromagnetic' properties or something akin to that, that are involved in such detailed 4D contextually relativistic phenomena.

Suppose that what happens at one chromosomal location (let's just call it the binding of a TF) directly affects whether that or a different TF binds somewhere else at the same time.  Whatever causes the first event, if that's how it works, the distance effect would be a very non-local phenomenon, one so central to organized life in complex organisms that it is, causally, not just a set of local gene expressions.  Somehow, some sort of 'information' is at work very fast and over very short distances. It is the worst sort of arrogance to assume it is all just encoded in DNA as a code we can read off along the strand and that will succumb to enumerative local informatic sequence analysis.

The current kind of purely local hypothetical sequence enumeration-based account seems too ordinary--it's not spooky enough!

The delicious smell of eggs!

Paleontologists like to give names, often self-serving names, to new fossil specimens they unearth. In  part, they want to control the agenda, the species and hence evolutionary track they are revealing (for the first time, naturally!).  One naturally wants to be known as the person who discovered Hobjob Man (Homo hobjobensis).

Well, geneticists are people, too, with all the vanities that accompany that distinction.  They want to name their genes and show their insight.  That's why we have names like 'BRCA' for the 'breast cancer' genes, and countless other examples.  In fact, BRCA1 is, on current best understanding, a general-use, widely expressed gene whose coded protein is used to detect certain types of DNA mutations in the cell (mismatches, or non-complementarity, between opposite nucleotides at the corresponding location on the two strands of the DNA molecule) and help fix them.  It is not the, or even a, gene 'for' breast cancer!  It received its name because mutations in the gene were discovered being transmitted among victims of breast cancer in large families. Once the gene was identified, the risk associated with it could be documented without needing to track it in families.   Proper gene-naming should describe the chromosomal location or, where known, the normal function of a gene, not why or how it was discovered, and should not suggest that its purpose is to cause disease.  Even the discovery-based labeling is risky because genes often if not typically serve multiple functions.

Humorous names like 'sonic hedgehog' are not informative but at least not misleading. One interesting example concerns the 'olfactory receptor' or OR genes.  These genes code for a set of cell-surface receptor proteins, part of a larger family of such genes, that were found in the olfactory (odor-detecting) tissues, such as the lining of the nose in vertebrates like mice and humans.  There is a huge family of such genes, about 1000 in mammals, that have arisen over the eons by gene duplication (and deletion) events.  Our genomes have isolated OR genes and also many clusters, of a few or up to hundreds of adjacent OR genes.  These arose by gene duplication events (and some were lost by inaccurate DNA copying that chopped off parts of the gene), so our genomes include both active and inactive (current and former) OR genes, their numbers varying somewhat in each of our genomes.

Big arrays of genes like these often are inaccurately duplicated when cells divide, including during the formation of sperm and egg cells.  The inaccuracy includes mutations that affect the coded OR protein of a given OR gene, and hence generate variation among the many different OR genes.  This process, over the millennia, generates the huge number and variety of gene family members, of which the OR family is the largest.  In the case of ORs, the idea has been that, like the immune system, these genes enable us to discriminate among odors--a vital function for survival, finding mates, detecting enemies, and so on.  Because of their high level of sequence diversity, each OR gene's coded protein responds to (can detect) a different set of molecules that might pass through the airways.  This allows us to detect--and remember--specific odors, because the combination of responding ORs is unique to each odor.  Discovery of this clever way by which Nature allows us to discriminate what's in our environment was worthy of a Nobel prize to Richard Axel and Linda Buck in 2004.
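For readers who like to see an idea spelled out, here is a toy sketch of that combinatorial logic. The receptor names and the odorants each one responds to are invented purely for illustration (they are not real OR genes or real binding data); the only point is that a few broadly tuned receptors can still give every odor its own combination.

```python
# Hypothetical receptors and the odorants each responds to (made-up data,
# not real OR genes or measured binding specificities).
RECEPTOR_RESPONSES = {
    "OR_A": {"vanilla", "smoke"},
    "OR_B": {"smoke", "citrus"},
    "OR_C": {"vanilla", "citrus", "mint"},
}


def odor_code(odorant):
    """Return the set of receptors that respond to a given odorant."""
    return frozenset(
        receptor
        for receptor, odorants in RECEPTOR_RESPONSES.items()
        if odorant in odorants
    )


odorants = ["vanilla", "smoke", "citrus", "mint"]
codes = {odorant: odor_code(odorant) for odorant in odorants}

# Each receptor is broadly tuned, yet no two odorants share the same
# combination of responding receptors.
all_distinct = len(set(codes.values())) == len(codes)

for odorant, code in codes.items():
    print(odorant, sorted(code))
print("every odor has a unique code:", all_distinct)
```

In this toy case three receptors separate four odors, since it is the combination, not any single receptor, that identifies the odor. With hundreds of receptors the number of possible combinations becomes enormous.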

The catch is that this only works because each nasal olfactory cell expresses only a single OR gene. How the others are shut off in that cell, while each of them is turned on in other olfactory cells, is interesting but not really understood.  At least, this elaborate system evolved for olfactory discrimination....didn't it?  After all, the genes are named for that!

Well, not so fast.  A recent paper by Flegel et al. in Frontiers in Molecular Biosciences has looked for OR expression in individual mammalian sperm cells.  It has concluded that these receptors, on the surface of sperm cells, enable the sperm to find and fertilize eggs.  As described by the authors, sperm cells locate egg cells in the female reproductive tract by various chemosensory/receptor means, in a process not fully understood. Various studies have found OR genes expressed on the surface of sperm cells, where they have been said to be involved in the movement mechanisms of sperm.  These authors checked all known OR genes for expression in human sperm cells (they looked for their RNA transcripts).  91 OR genes were detected as being expressed in this way.  They showed their presence in various sub-cellular compartments in the sperm cells, which may be suggestive of specific functions.

Interestingly, the authors claim they've been leaders in detecting 'ectopically' expressed OR gene transcripts (but they aren't the only people documenting such 'ectopic' expression; see this post from 2012).  Whether this is just transcriptional noise or really functional, the very term 'ectopic' suggests the problem with gene naming.  If they're in sperm cells, they aren't properly named as 'olfactory' receptors.  These authors detected varying numbers of OR genes in different samples.  Some of this can be experimental error, but if it is highly controlled variable expression, serious questions arise.  Many of the transcripts were from the antisense (opposite) DNA strand to the one that actually codes for a protein sequence.

The authors found some systematic locations of specific OR genes in the sperm cells, as shown here:

Localization in sperm of specific Olfactory Receptor genes.  Source: Flegel et al., see text.

These results are quite strange, and their plausibility is hard to judge.  It is no surprise whatever to find that genes are used in multiple contexts.  But in this particular case, repeatable findings could mean sloppy transcription, so that actually important genes are near the OR genes and the latter are just transcribed and/or translated without real function.  Of course, the authors suggest there must be some function because, essentially, of the apparent orderliness of the findings.  Yet this is very hard to understand.

OR genes vary presumably because each variant responds to differing odorant molecules.  With a repertoire of hundreds of genes, and only one expressed per olfactory neuron, we can distinguish, and remember, odors we have experienced.   For similar reasons, the genes are highly mutable--again, that keeps the detectability repertoire up.  Your brain needs to recognize which receptor cells a given odor triggers, in case of future exposure.  But the combination of reporting cells, that is, their specific ORs, shouldn't generally matter so long as the brain remembers.

That eggy aroma!
There is a huge burden of proof here.  Again it is not the multiple expression, but the suggested functions, that seem strange.  If the findings actually have to do with fertilization, what is the role of this apparently random binding-specificity, the basic purportedly olfactory repertoire strategy, of these genes on the sperm cells' surface?  How can a female present molecules that are specifically recognized by this highly individualistic OR repertoire in the male?  How can her egg cell or genital tract or whatever, present detectable molecules for the sperm to recognize? What is it that guides or attracts them, whose specificity is retained even though the OR genes themselves are so highly variable?

And of course one has to be exceedingly skeptical about antisense OR-specific RNAs having any function, if that is what is proposed.  It is more than hard to imagine what that might be, how it would work, or most importantly, how it would have evolved.  Is this a report of really striking new genetic mechanisms and evolution....or findings not yet clear between function and noise?

The mechanism is totally unclear at present, and the burden of proof a major one.  Given that others have reported that OR genes are expressed in other cells, the evidence suggests that such expression is clearly believable, whatever the reason. Indeed, years ago it was speculated that they might serve to identify body cells with unique OR-based 'zip codes' for various internal use as we recognize which cell is in which tissue and the like.

Sperm- and/or testis-specific expression of at least some OR genes has also been observed before, as these authors note, but with less extensive characterization.  Is it functional, or just sloppy genome usage?  Time will tell.  The sperm cells are programmed to know the delicious smell of freshly prepared eggs.  Now, perhaps the next check should be to see whether the same sperm cells are also looking for (or lured by) the aroma of freshly fried bacon!

But if it is another use of a specific cell-identification system, of which olfactory discrimination is but one use, then it will be consistent with the well-known opportunistic nature of evolution.  There are countless precedents.  How this one evolved will be interesting to know and, perhaps especially, to learn whether olfaction was its initial use, or one adopted after some earlier--perhaps fertilization-related--function had already evolved.

But for our purposes today, the clear lesson, at least, should be the problem of gene names inaptly assigned because of their first-discovered function (or, in our view, because some whimsical geneticist liked a particular movie or cartoon character, like Sonic Hedgehog).

Rare Disease Day and the promises of personalized medicine

Our daughter Ellen wrote the post that I republish below 3 years ago, and we've reposted it in commemoration of Rare Disease Day, Febru...