
"The Blizzard of 2016" and predictability: Part II: When is a prediction a good one? When is it good enough?

Weather forecasts require the prediction of many different parameter values.  These include temperature, wind at the ground and aloft (the winds that steer storm systems, and where planes fly), humidity at the ground and in the air (which determines rain and snowfall), friction (related to tornadoes and thunderstorms), and the change over time and track of all of these across a surface with its own weather-affecting characteristics (water, mountains, cities).  Forecasters have to model and predict all of these things.  In my day, we had to do it mainly with hand-drawn maps and ground observations (no satellites, basically no useful radar, only scattered ship reports over oceans, etc.), but of course now it's all computerized.

Other sciences are in the prediction business in various ways.  Genetic and other aspects of epidemiology are among them.  The widely made, now trendy promise of 'precision' medicine, or the predictions of what's good or bad for you, are clear daily examples.  But as with the weather, we need some criteria, or even some subjective sense of how good a prediction is.  Is it reliable enough to convince you to change how you live?

Yesterday, I discussed aspects of weather prediction and what people do in response, if anything.  Last weekend's big storm was predicted many days in advance, and it largely did what was being predicted.  But let's take a closer look and ask: How good is good enough for a prediction?  Did this one meet the standard?

Here are predicted patterns of snowfall depth, from the January 24th New York Times, the day after the storm, with data provided by the National Weather Service:



And now here are the measured results, as reported by various observers:




Are these well-forecast depths, or not?  How would you decide?  Clearly, the maximum snowfall reported (42") in the Washington area was a lot more than the '20+"' forecast, but is that nit-picking?  "20+" does leave a lot of leeway for additional snowfall, after all.  But the predicted contour plot is very similar to the actual result.  We are in State College, rather a weather capital because the Penn State Meteorology Department has long been a top-rated one, and because AccuWeather is located here as a result.  Our snowfall was somewhere between 7 and 10 inches.  The top prediction map shows us in the very light area, with somewhere between 1-5" and 7-10" expected, and the forecasts called for a sharp boundary between virtually no snowfall and a large dump.  A town only a few miles north of us got only a few inches.

So was the forecast a good one, or a dud?

How good is a good forecast?
The answer to this fair question depends on the consequences.  No forecast can be perfect--not even in physics where deterministic mathematical theory seems to apply.  At the very least, there will always be measurement errors, meaning you can never tell exactly how good a prediction was.
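
One way to make that judgment less subjective is to score a forecast against what actually happened.  Below is a minimal verification sketch in Python; the station names and snowfall amounts are hypothetical, invented only to illustrate the kinds of scores forecasters compute, not the actual storm data.

```python
# A minimal verification sketch: score a snowfall forecast against what fell.
# Station names and values are hypothetical, invented only to illustrate the
# kinds of scores forecasters compute; they are not the actual storm data.

forecast = {"Washington": 22.0, "Baltimore": 24.0, "New York": 18.0,
            "State College": 6.0, "Boston": 8.0}    # predicted inches
observed = {"Washington": 30.0, "Baltimore": 29.0, "New York": 27.0,
            "State College": 8.5, "Boston": 5.0}    # reported inches

errors = [forecast[s] - observed[s] for s in forecast]
mae = sum(abs(e) for e in errors) / len(errors)     # mean absolute error
bias = sum(errors) / len(errors)                    # negative = under-forecast

# Categorical skill: did the forecast warn of a "major" (>= 12") snowfall
# everywhere one actually occurred?
threshold = 12.0
events = [s for s in observed if observed[s] >= threshold]
hits = [s for s in events if forecast[s] >= threshold]
print(f"MAE = {mae:.1f} in, bias = {bias:+.1f} in, "
      f"major-snow hits = {len(hits)}/{len(events)}")
```

Whether a forecast counts as 'good' then depends on which score you care about: the continuous error in inches, or the categorical skill in warning of a major snowfall at all.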

As a lead-up to the storm's arrival in the east, I began checking a variety of commercial weather companies (AccuWeather, WeatherUnderground, the Weather Channel, WeatherBug) as well as the US National and the European Weather Services, interested in how similar they were.

This is an interesting question, because they all rely on a couple of major computer models of the weather, including an 'ensemble' of their forecasts.  The companies all use basically the same global data sources, the same physical theory of fluid dynamics, and the same resulting numerical models.  They try to appear original (that's the nature of the commercial outfits, of course, since they need to make sales, and even the government services want to stay in the public eye).

In the vast majority of cases, as in this one, the shared data from weather balloons, radar, ground reports, and satellite imagery, as well as the shared physical theory, mean that there are really only minor differences in how the theory is applied in the computed models.  Archived data also allow retrospective analysis, to see how each model has been doing and to adjust it accordingly.  For the curious, most of this is, rightly, freely available on the internet (thanks to its ultimately public nature).  Even the commercial services, as well as many universities, make data conveniently available.

In this case, the forecasts did vary.  All more or less had us (State College) on a sharp edge of the advancing snow front.  Some forecasts had us getting almost no snow, others 1-3", others in the 5-8" range.  These varied within any given organization over time, as of course they should when better model runs become available.  But that improvement usually comes only as D-day draws closer and less extrapolation is demanded of the models; the longer-range forecasts are, in that sense, less accurate or useful from a precision point of view.  At the same time, all made it clear that a big storm was coming and that our location was near the edge of real snowfall.  They all also agreed about the big dump in the Washington area, but varied in what they foresaw for New York and, especially, Boston.  Where most snow and disruption occurred, they gave plenty of notice, so in that sense the rest can be said to be details.  But if you expected 3" of snow and got a foot, you might not feel that way.

If you're in the forecasting business--be it for the weather or for health risks based on, say, your genome or lifestyle exposures--you need to know how accurate forecasts are, since they can lead to costly or even life-or-death consequences.  Crying wolf--and weather companies seem ever tempted to be melodramatic to retain viewers--is not good, of course, but missing a major event could be worse, if people were not warned and didn't take precautions.  So it is important to have comparative predictions by various sources based on similar or even the same data, and for them to keep an eye on each other's reasoning, and to adjust.

Accuracy over forecast distance (that is, lead time) raises a different sort of issue: precision.  Here is the forecast by our excellent local company, AccuWeather, for the next several days:

This and the figure below are from AccuWeather.com

And here is their forecast for the days after that.



How useful are these predictions, and how would you decide?  What minor or major decisions would you make, based on your answers?  Here nothing nasty is in the forecast, so if they blow the temperature or cloud cover forecast on the out-days of this span, you might grumble but you won't really care.

However, I'm writing this on Sunday, January 24.  The consensus of several online forecasts was roughly like the figures above: basically smooth sailing for the week, with a southerly and hence warm but not very stormy air flow, and no significant weather.  But late yesterday, I saw one forecast for the possibility of another Big One like what we just had.  The forecaster outlined the similarities between today's conditions and those of ten days ago, and in a way played up the possibility of another storm like it.  So I looked at the upper-air steering winds and found that they seem to be split between one branch that will steer cold arctic air down towards the southern and eastern US, and another that will sweep across the south, over the moist Gulf of Mexico, and join up with the first branch in the eastern US--which is basically what happened last week!

Now, literally as I write, one online forecast outfit has changed its forecast for the coming week-end (just 5 days from now) to rain and possibly ice pellets.  Another site now asks "Could the eastern US face more snow later this week?" Another makes no such projection.  Go figure!

Now it's Monday.  One commercial site is forecasting basically nothing coming.  Another forecasts the probability of rain starting this weekend.  NOAA is forecasting basically nothing through Friday.

But here are screenshots from an AccuWeather video on Monday morning, discussing the coming week.  First, there is doubt as to whether the Low pressure system (associated with precipitation) will move up the east coast or farther out to sea.  The actual path taken, steered by upper-level winds, will make a big difference in the weather experienced in the east.

Source: AccuWeather.com

The difference in outcomes would essentially be because the relevant wind will be across the top of the Low, moving from east to west, that is, coming off the ocean onto land (air circulates as a counter-clockwise eddy around the center of the Low).  Rain or possibly snow will fall on land as the result.  How much, or how cold it will be depends on which path is taken.  This next shot shows a possible late-week scenario.

Source:  AccuWeather.com
The grey shows the upper-level steering winds, but their actual path is not certain, as the prior figure showed, meaning that exactly where the Low will go is uncertain at present.  There just isn't enough data, and so there's too much uncertainty in the analysis, to be more precise at this stage.  The dry and colder air shown coming from the west would flow underneath the moist air flowing in from offshore, pushing it up and causing precipitation.  If the flow takes the more eastward of the alternatives in the previous figure, the 'action' will mainly be out at sea.

Well, it's now Monday afternoon, and two sites I check are predicting little if anything as of the weekend....but another site is predicting several days in a row of rain.  And....(my last 'update'), a few hours later, the site is predicting 'chance of rain' for the same days.

To me, with my very rusty, and by now semi-amateur checking of various things, it looks as if there won't be anything dropping on us.  We'll see!

The point here is how much, and how fast, things change with little prior indication--and we are only talking about predicting a few days, not weeks, ahead.  The AccuWeather video above shows the uncertainty explicitly, so we're not being misled, just advised.

This level of uncertainty is relevant to biology precisely because meteorology is based on sophisticated, sound physical theory (hydrodynamics, etc.).  It lends itself to high-quality, very extensive and even exotic instrumentation and to mathematical computer simulation modeling.  Most of the time, for most purposes, it is already an excellent system.  And yet, while major events like the Big Blizzard this January are predictable in general, if you want specific geographic details, things fall short.  It's a subjective judgment as to when one would say "short of perfection" rather than "short but basically right."

With more instrumentation (satellites, radar, air-column monitoring techniques, and faster computers) it will inevitably get better.  Here's a reasonable case for Big Data.  However, because of measurement errors and minor fluctuations that can't be detected, inaccuracies accumulate (this is an early example of what is meant by 'chaotic' systems: the farther down the line you want to predict, the greater your errors).  Today in meteorology, I've been told by professional colleagues who are up to date, a week ahead is about the limit, except in areas like deserts where things hardly change.  Beyond that, at least in conditions and locations where weather change is common, a forecast of specific conditions is no better than the climate average for that location and time of year.
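
To see what 'chaotic' error growth means in miniature, here is a toy Python sketch.  It uses the logistic map, a standard textbook example of a chaotic system (not a weather model), and follows two runs whose starting values differ by one part in a million, standing in for an undetectable measurement error.

```python
# A toy illustration of 'chaotic' error growth -- not a weather model.
# The logistic map is a standard textbook chaotic system; the two runs below
# start one part in a million apart, standing in for an undetectable
# measurement error in the initial conditions.

r = 3.9                       # parameter value in the chaotic regime
x, y = 0.500000, 0.500001     # "true" state vs slightly mis-measured state

for step in range(1, 31):
    x = r * x * (1 - x)
    y = r * y * (1 - y)
    if step % 5 == 0:
        print(f"step {step:2d}: difference = {abs(x - y):.6f}")

# The difference grows roughly exponentially until the two runs are
# unrelated -- the same reason forecast skill decays with lead time.
```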

The more dynamic the situation--changing seasons, rapidly shifting air and moisture movement patterns, mountains or other local effects on air flow--the less predictable it is beyond a few days.  You have to take such longer-range predictions with a huge grain of salt, understanding that they're the best that theory, intuition, and experience can do at present (and taking into account that it is better to be safe--warned--than sorry, and that companies need to promote their services with what we might charitably call energetic presentations).  The reality is that under all but rather stable conditions, such long-term predictions are misleading and probably shouldn't even be made: weather services should 'just say no' to offering them.

An important aspect of prediction these days, where 'precision' has recently become a widely canted promise, is in health.  Epidemiologists promise prediction based on lifestyle data.  Geneticists promise prediction based on genotypes.  How reliable or accurate are they now, or likely to become in the predictable future?  At what point does population average do as well as sophisticated models? We'll discuss that in tomorrow's installment.

"The Blizzard of 2016" and predictability: Part I--the value of prediction

Mark Twain famously quipped, "Everybody talks about the weather but nobody does anything about it." But these days, that's far from accurate.  At least, an army of specialists try to predict the weather so that we can be prepared for it.  The various media, as well as governmental agencies, publicize forecasts.  But how good are those forecasts?

As a former meteorologist myself (back--way back--when I was an Air Force weather officer), I take an interest, partly professional but also conceptual, in how accurate forecasting has become in our computer and satellite era.

Last week, a storm developed over the southwest and combined with an atmospheric disturbance barreling down from the Canadian arctic to cause huge rain and wind damage across the south; it then veered north, where it turned into "The Blizzard of 2016," as dubbed by the exaggeration-hungry media.  How well was it forecast, and did that do any societal good?

Here is a past-few-days summary page of mapped conditions at upper-air (upper left), surface (upper right) and other levels.  On a web page called eWall ( http://mp1.met.psu.edu/~fxg1/ewall.html ) you can scroll these for the prior 5 days.  The double Low pressure centers (red L's) on the right panel represent the center of the storm, steered in part by the winds aloft (other panels).



If you followed the forecasting over the week leading up to the storm's storming up the east coast to wreak havoc there, you would say it was exceedingly well forecast, and many days in advance.  Was it worth the cost?  One has to say that probably many lives were saved, huge damage avoided, and disruption minimized: people emptied grocery store shelves and hunkered down to watch the Weather Channel (and State College's own AccuWeather).  Urgent things, including shopping for supplies in case of being house-bound, were done in advance, and probably many medical and other such procedures were carried out early or rescheduled.  Despite the very heavy snowfall, as predicted, the forecast was accurate enough to have been life-saving.

Lots of people still don't do anything about it!
And yet....
Despite a lot of people talking about the weather, on all sorts of public media, masses of people, acting like Mark Twain, don't do anything about it, even with the information in hand.  At least 12 people died in this storm in accidents, and others from coronaries while shoveling, and this is just what I've seen in a quick check of the online news outlets.  Thousands upon thousands were stranded for many hours in freezing cold on snow-sodden highways.  There were things like 25-mile-long stationary lines of vehicles on interstates and thousands of car and truck accidents.  That's a lot of people paying the price for their own stubbornness or ignorance.  This is what such a jam looks like:

A typical snowstorm traffic jam (www.breakingnews.com)
People were warned in the clearest terms for days in advance.  Our fine National Weather Service, in collaboration with complementary services in other countries, scoped out the situation and let everyone know about it, as is their very important job.  Some states, like New York,  properly closed their roads to all but necessary traffic. Their governments did their jobs.  Other states, like Kentucky, failed to do that.  So then, how is it that there was so much of what seems like avoidable damage?

Let's put the issue another way: My auto insurance rates will reflect the thousands of costly claims that will be filed because of those who failed to heed the warnings and were out on the highways anyway. So I paid for the forecasts first through my taxes, and then through the purchase prices of goods whose makers pay to advertise on weather channels, but then I also have to pay for those whose foolhardiness led to the many accidents they'll make claims for.  That's similar to people knowingly enjoying an unhealthy lifestyle, and then expecting health insurance to cover their medical bills--that insurance, too, is amortized over the population of insured including those who watch their lifestyles conscientiously.  That's the nature of insurance.

Some people, of course, simply can't stay home.  But many just won't.  Countless truckers were stranded on the roads.  They surely knew of the coming storm.  Did commercial pressure keep them on the road?  Then shame on their companies!  They surely could have pulled over or into Walmart parking lots to wait out the snowfall and its clearance--a day or so, say.  Maybe there aren't enough parking lots for that, but surely, surely they should not have been on the Interstates!  And while some people probably had strong legitimate reasons for being out, and a few may not have seen the strong, repeated forecasts over the many preceding days, most--I would say by far the most--just decided to take their trips anyway.

Nobody can say they aren't aware of pileups, crashes, and hours-long stalls that happen on Interstates during snowstorms.  It is not a new phenomenon!  Yet, again, we all will have to pay for their foolhardiness.  Maybe insurance should refuse to cover those on the road for unnecessary trips. Maybe those who clog the roads in this way should be taxed to cover the costs of, say, increased insurance rates on everyone else or emergencies that couldn't be dealt with because service vehicles couldn't get to the scene.

The National Weather Service, and companies who use their data, did a terrific job of alerting people of the coming storm, and surely saved many lives and prevented damage as a result.  Just as they do when they forecast hurricanes and warn of tornadoes.  Still, there are always people who ignore the warnings, at their own cost, and at cost to society, but that's not the fault of the NWS.

But what about predictability? Did they get it right?  What is 'right'?
It is a fair and important question to ask how closely the actual outcome of the storm was predicted.   The focus is on the accuracy in detail, not the overall result, and that leads one to examine the nature of the science and--of course in our case here on this blog--to compare it with the state of the art of epidemiological, including genetic, predictions.  Not all forecasts are as dramatic and in a sense clear-cut as a major storm like this one.

I have been in the 'prediction' business for decades, first as a meteorologist and subsequently in trying to understand the causal relationships, genetic and evolutionary, that explain our individual traits.  Tomorrow, we'll discuss aspects of the Big Storm's forecasts that weren't so accurate and compare that with the situation in these biological areas.

Food-Fight Alert!! Is cancer bad luck or environment? Part I: the basic issues

Not long ago Vogelstein and Tomasetti stirred the pot by suggesting that most cancer was due to the bad luck of inherent mutational events in cell duplication, rather than to exposure to environmental agents.  We wrote a pair of posts on this at the time.  Of course, we know that many environmental factors, such as ionizing radiation and smoking, contribute causally to cancer because (1) they are known mutagens, and (2) there are dose or exposure relationships with subsequent cancer incidence.  However, most known or suspected environmental exposures do not change cancer risk very much, or if they do, the effect is difficult to estimate or even to prove.  For the purposes of this post we'll simplify things and assume that what transforms normal cells into cancer cells is genetic mutation; though causation isn't always so straightforward, that won't change our basic storyline here.

Vogelstein and Tomasetti upset the environmental epidemiologists' apple cart by using some statistical analysis of cancer risks related, essentially, to the number of cells at risk, their normal time of renewal by cell division, and age (time as correlated with number of cell divisions).  Again simplifying, the number of at-risk actively dividing cells is correlated with the risk of cancer, as a function of age (reflecting time for cell mutational events), and with a couple of major exceptions like smoking, this result did not require including data on exposure to known mutagens.  V and T suggested that the inherently imperfect process of DNA replication in cell division could, in itself, account for the age- and tissue-specific patterns of cancer.  V and T estimated that except for the clear cases like smoking, a large fraction of cancers were not 'environmental' in the primary causal sense, but were just due, as they said, to bad luck: the wrong set of mutations occurring in some line of body cells due to inherent mutation when DNA is copied before cell division, and not detected or corrected by the cell.  Their point was that, excepting some clear-cut environmental risks such as ionizing and ultraviolet radiation and smoking, cancer can't be prevented by life-style changes, because its occurrence is largely due to the inherent mutations arising from imperfect DNA replication.
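
For a sense of the kind of analysis involved, here is a rough Python sketch of the relationship V and T examined: lifetime number of stem-cell divisions in a tissue versus lifetime cancer risk in that tissue, compared on log scales.  The tissue labels and numbers below are illustrative placeholders, not their published data.

```python
# A rough sketch of the kind of relationship V and T examined: lifetime
# number of stem-cell divisions in a tissue vs lifetime cancer risk in that
# tissue, compared on log scales.  The tissue labels and numbers below are
# illustrative placeholders, not their published data.
import math

tissues = {                 # (lifetime stem-cell divisions, lifetime risk)
    "tissue A": (1e12, 5e-2),
    "tissue B": (1e10, 7e-3),
    "tissue C": (1e9,  1.4e-3),
    "tissue D": (1e8,  4e-4),
    "tissue E": (1e7,  2e-4),
}

logx = [math.log10(d) for d, _ in tissues.values()]
logy = [math.log10(r) for _, r in tissues.values()]

n = len(logx)
mx, my = sum(logx) / n, sum(logy) / n
num = sum((a - mx) * (b - my) for a, b in zip(logx, logy))
den = math.sqrt(sum((a - mx) ** 2 for a in logx)) * \
      math.sqrt(sum((b - my) ** 2 for b in logy))
print(f"Pearson correlation on log-log scale: {num / den:.2f}")
```

A strong correlation on this scale is what motivated the 'bad luck' interpretation; it does not by itself rule out environmental contributions, which is exactly where the fight begins.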

Boy, did this cause a stink among environmental epidemiologists!  One, we think, undeniable factor in this food fight is that environmental epidemiologists and the schools of public health that support them (or, more accurately, that the epidemiologists support with their grants) would be put out of business if their very long, very large, and very expensive studies of environmental risk (and the huge percentage of additional overhead that pays for the schools' members' meal tickets) were undercut--no longer funded, with the money going elsewhere.  Pride, too, is always a factor in science, because science is run by humans: all that epidemiological work would go to waste, to the chagrin of many, if it turned out to be based on a misunderstanding of the basic nature of the mutagenic, and hence carcinogenic, processes.

So naturally the V and T explanation has been heavily criticized from within the industry.  But the critics also raise the point, and it's a valid one, that we are clearly exposed to many different agents and chemicals that are products of our culture rather than inevitable, that are known to cause mutations in cell culture, and that these certainly must contribute to cancer risk.  The environmentalists naturally want the bulk of causation to be due to such lifestyle factors because (1) they do exist, and (2) they are preventable, at least in principle.  They don't in principle object to the reality that inherent mutations do arise and can contribute to cancer risk, but they assert that most cancer is due to bad behavior rather than bad luck, and hence that we should concentrate on changing our behavior.

Now, in response, a paper in Nature ("Substantial contribution of extrinsic risk factors to cancer development," Wu et al.) provides a statistical analysis of cancer data as a rebuttal to V and T.  The authors present various arguments against the assertion that most cancer can be attributed to inherent mutation, and argue instead that external factors account for 70 to 90% of risk.  So there!

These are largely technical arguments, and you can judge for yourself which seem more persuasive (many blog and other commentaries are also available, since this question hits home to important issues--including vested interests).  But nobody can credibly deny that both environment and inherent DNA replication errors are involved.  DNA replication is demonstrably subject to uncorrected mutational change, and that (for example) is what has largely driven evolution--unless epidemiologists want to argue that for all species in history, lifestyle factors were the major mutagens, which is plausible but very hard to prove in any credible sense.

At the same time, environmental agents do have mutational effects of various sorts, and higher doses generally mean more mutations and higher risk.  So the gist of the legitimate argument (besides professional pride or territoriality and the preservation of public health's mega-studies) is really the relative importance of environment vs inherent processes.  The territoriality component of this is reminiscent of the angry assertion among geneticists, about 30 years ago, that environmental epidemiologists and their very expensive studies were soaking up all the money so that geneticists couldn't get much of it.  That is one reason geneticists were so delighted when cheap genome sequencing and genetic epidemiological studies (like GWAS) came along, promising to solve problems that environmental epidemiology wasn't answering--to show that it's all in the genes (and so that's where the funding should go).

But back to basic biology 
Cells in each of our tissues have their own life history.  Many or most tissues are maintained by specialized stem cells that divide, with one of the daughter cells differentiating into a mature cell of that tissue type.  This is how, for example, the actively secreting or absorbing cells in the gut are produced and replaced during life.  Various circumstances, inherent and environmentally derived, can affect the rate of such cell division.  Stimulating division is not the same as being a direct mutagen, but there is confounding, because more cell division means more inherent mutational accumulation.  That is, an environmental factor can increase risk without being a mutagen, the mutations themselves being due to inherent DNA replication error.  Cell division rates among our different tissues vary quite a lot: some tissues renew continually during life, others less so, some renew under specific circumstances (e.g., pregnancy or hormonal cycles), and so on.

As we age, cell divisions slow down, also in patterned ways.  So mutations will accumulate more slowly, and they may be less likely to cause an affected cell to divide rapidly.  After menopause, breast cells slow or stop dividing.  Other cells, as in the gut or other organs, may still divide, but less often.  Since mutation, whether caused by bad luck or by mutagenic agents, affects cells when they divide and copy their DNA, mutation rates and hence cancer rates often slow with advancing age.  So the rate of cancer incidence is age-specific, as well as related to the size of organs and to lifestyle stimuli to growth or mutation.  These are at least general characteristics of cancer epidemiology.

It would be very surprising if there were no age-related aspect to cancer (as there is with most degenerative disease).  The absolute risk might diminish with lower exposure to environmental mutagens or mitogens, but the replicability and international consistency of the basic patterns suggest an inherent cytological etiology.  That does not, of course, in any sense rule out environmental factors working in concert with normal tissue activity, so that, as noted above, it's not easy to isolate environmental from inherent causes.

Wu et al.'s analysis makes many assumptions, the data (on exposures and cell counts) are suspect in many ways, and it is difficult to accept that any particular analysis is definitive.  And in any case, since both types of causation are clearly at work, how important are the particular percentages of risk due to each?  Clearly strong avoidable risks should be avoided, but just as clearly we should not chase down every minuscule risk or complex unavoidable lifestyle aspect, when we know inherent mutations arise and we have a lot of important diseases to try to understand better, not just cancer.

Given this, and without discussing the fine points of the statistical arguments, the obvious bottom line that both camps agree on is that both inherent and environmental mutagenic factors contribute to cancer risk. However, having summarized these points generally, we would like to make a more subtle point about this, that in a sense shows how senseless the argument is (except for the money that's at stake). As we've noted before, if you take into account the age-dependency of risk of diseases of this sort, and the competing causes that are there to take us away, both sides in this food fight come away with egg on their face.  We'll explain what we mean, tomorrow.

Unsuspected death rates and miners' canaries

One of the major problems with health-risk prediction, from genetic or even other evidence, is that risks are estimated from past data but predictions are of course only for the future.  This is not a minor point!  Predictions are predicated on the assumption that what we've seen in the past will persist.  To a somewhat lesser extent, predictions based on measured risk factors are also based on the further assumption that variables used to estimate risk are causative and not just correlated with the outcome.

An inconvenient truth is that the two, retrospective and prospective analysis, are not the same; their connection hinges on these assumptions, and the assumptions are by no means obviously true.  We have written about this basic set of problems many times here.

Now a new study, which we first saw reported in the NY Times, finds that while overall death rates in the US have generally been dropping, there is an exception: the authors note "the declining health and fortunes of poorly educated American whites.  In middle age, they are dying at such a high rate that they are increasing the death rate for the entire group of middle-aged white Americans, [authors] Dr. Deaton and Dr. Case found.....The mortality rate for whites 45-54 years old with no more than a high school education increased by 135 deaths per 100,000 people from 1999 to 2014."

This is very different from other developed countries for this particular age group, as shown by this figure from the authors' PNAS paper, and deviates from the generally improving age-specific mortality rates in those countries.


From Deaton and Case, PNAS Nov 2015

There are lots of putative reasons for this observation.  The main causes of death were suicide and drug- and alcohol-related diseases, as shown below by the second figure from their paper.  There were mental illnesses associated with financial stress, opiate misuse, and so on.


From Deaton and Case, PNAS Nov 2015

There are sociological explanations for these results, results that other demographic investigators had apparently not noticed.  They do not seem to be mysterious, nor is there any suggestion of scientific error involved.  Our point is a different one, and it assumes these results are entirely true, as they seem to be.

When the future is unpredictable, to an unpredictable or unknowable extent
Why were these findings a surprise?  First, perhaps, because nobody bothered to look carefully at this segment of our society or at these particular subsets of the data.  To this extent, predictions of disease based on GWAS and other association studies of risk will have used past exposure-outcome associations to predict today's disease occurrences.  But they'd have been notably inaccurate, either because the factors Deaton and Case identified were not considered, or because behavioral patterns changed in ways that couldn't have been taken into account in past studies.  There may of course be other causes, which these authors didn't observe or consider, that account for some of the pattern they found, and there may be other population subsets with lower or higher risks than expected, if only investigators happened to look for them.  There is, of course, no way to know what data, causes, or subsets one didn't know about, didn't measure, or simply didn't consider.

That is a profound problem with risk projections based on past observations.  The risk-factor assessments of the past were adjusted for various covariates in the usual way, but one can't know all of what one should include.  There is just no way to know that and, more profoundly, as a result no way to know how inaccurate one's risk projections are.  But that is not even the most serious issue.

Much deeper is the problem that even if all exposures and behaviors of the study subjects from whom risk estimates were derived by correlation studies were fully recorded, those estimates have unknown and unknowable relevance to future risks.  The reason is that the exposures of people in the future to these same risk factors will change, even if their genomes don't (and, of course, no two current people have the same genome, nor the same genome as anyone in the studies on which risks were estimated).  Even if the per-dose effects were perfectly measured (no errors of any kind), the mixture of exposures to these factors will not be the same, and hence the achieved risk will differ.  There is no way to know what that mix will be.
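
To make the point concrete, here is a minimal Python sketch: the same perfectly known per-exposure risks, applied to two populations with different exposure mixes, give different overall risks.  All of the groups, risks, and mixes below are hypothetical.

```python
# A minimal sketch: the same perfectly known per-exposure risks, applied to
# two populations with different exposure mixes, give different overall
# risks.  All groups, risks, and mixes here are hypothetical.

per_exposure_risk = {"exposure A": 0.20, "exposure B": 0.10, "neither": 0.02}

study_mix  = {"exposure A": 0.40, "exposure B": 0.20, "neither": 0.40}
future_mix = {"exposure A": 0.15, "exposure B": 0.35, "neither": 0.50}

def population_risk(mix):
    # average risk = sum over groups of (fraction of population x group risk)
    return sum(mix[g] * per_exposure_risk[g] for g in mix)

print(f"risk implied by the study-era mix : {population_risk(study_mix):.3f}")
print(f"risk under a different future mix : {population_risk(future_mix):.3f}")
```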

Worse, perhaps by far, is that future risk exposures are unknowable in principle.  If a new drug for treating people under financial stress, or a new recreational drug, or a new type of cell phone or video screen, or a new diet or behavioral fad comes along, it may substantially affect risk.  It will modify the mix of existing exposures, but its quantitative effect on risk simply cannot be factored into the predicted risks because we can't consider what we have no way to know about.

In conclusion
The current study is a miners' canary in regard to predictions of health risks, whether from genetic or environmental perspectives.  This particular study is retrospective, and just shows the impact of failing to consider variables, relative to what had been concluded (in this case, that there has been a general improvement in mortality rates).  The risk factors and mortality causes reported are within the general set of things we know about, and the study merely shows that oversights in using the data--not any form of cheating, bad measurement, etc.--were responsible for the surprise discovery.  These things can easily be corrected.

But the warning is that there are likely many factors related to health experience that are still not measured but should be, and an unknown number that cannot be measured, for the simple reason that they do not yet exist.  The warning canaries have been cheeping as loudly as they can for quite a while, in regard to both environmental and genomic epidemiology.  The fault lies not in the canaries, but in the miners' leaders--the scientific establishment--who don't care to hear their calls.

My grandmother's dementia and me

My father's mother had Alzheimer's disease, or dementia of some sort, as did her sister.  Both lived with us at different times when I was a child, my great-aunt until she died in the bedroom upstairs, and my grandmother until she was impossible for my parents to care for, at which time they found a very kind, very patient woman with a big house in the country, and she went to live there.

These two sisters, the only children in their family, were always close.  They both worked all their lives, and were extremely competent and very kind.  My great-aunt never married; her fiancé had gone off to fight in the Spanish-American war, but died during an outbreak of yellow fever in Florida before he ever got to Cuba.  But, she lived with a cousin for many years.  When my parents finally cleaned out the apartment after my great-aunt died, one of the things they found in the attic was a skull that must have once been used for teaching anatomy.  No one had any clue how it ended up in that attic.  My parents have displayed it in their living room for most of my life.  My mother's theory, after years of living with it, is that this is the skull of a poor man who was suffering from an abscessed tooth, and he shot himself in the head because he couldn't stand the pain.  Here's a sketch.

Sketch by A Buchanan


My grandmother married and had one child, my father.  My grandparents, my great-aunt and her cousin all lived perhaps half an hour from us, in the town where my father had grown up, and my grandfather drove them all to visit us on Sunday afternoons.  He loved driving -- he enjoyed taking my sisters and me for drives in the country. What I remember most about these drives was the overwhelming odor of his strong cigars.  (He used to enjoy shooting woodchucks, too, happy to be doing farmers such a favor.  I remember going with him and my grandmother once on such an outing, but I refused to take a shot, which disappointed him.  He would steady his gun on the roof of the car, aim and shoot.  He draped the one woodchuck he killed the day I was with him over the gate into the field he'd shot it in, so that the farmer would take note.  One Sunday when they came to visit, there was a bullet hole in the roof of the car, over the passenger side -- I don't remember that that was ever explained.)

Dementia does unpredictable things to people.  My great-aunt -- Aunt, we called her, as my father had -- was always cheerful and sweet, if a bit confused.  Every morning she would ask where she was, but she was still able to play cribbage with us, and she loved having us comb her long thin hair, past grey, now yellowed, and pin it into a bun.  I don't remember that she ever fussed about anything.

My grandmother, on the other hand, was distraught with worry from the moment she woke, to the moment she went to bed, and probably long after that.  She would sit at the kitchen table all day every day, every few minutes asking the same worried questions in the same frantic way.  She was miserable.  Occasionally she was able to access a part of her brain that reminded her that she was confused, and that made things even worse.

Apart from being two different versions of the same heart-wrenching story that could be told by so many people, this raises several questions.  Were these two sisters with very different forms of the same disease?  Or did they have two different diseases?

And, did the fact that both his mother and his aunt had dementia mean that my father was at higher risk of dementia himself?  Apparently not, as he is now in his late 80's, still very active, very engaged, mentally and even physically.  In turn, does this mean that my sisters and I don't have to worry about dementia ourselves?

Or is it secular trends in Alzheimer's disease that we should pay attention to?
One measure of a condition's impact is its prevalence.  That is the fraction of the population at a given point in time that is affected.  A recent BBC Radio 4 program, More or Less, discussed changes in Alzheimer's prevalence over time, after a paper reporting (among many other things) decreased prevalence of dementia in the UK was published in The Lancet ("Global, regional, and national disability-adjusted life years (DALYs) for 306 diseases and injuries and healthy life expectancy (HALE) for 188 countries, 1990–2013: quantifying the epidemiological transition," Murray et al.). According to the study, prevalence of dementia in British people over age 65 has declined by more than 20% in the last 20 years; it's currently about 7 percent of that segment of the population.

This is in striking contrast to a recent report in the UK that estimates that 1/3 -- 33%! -- of the British children born in 2015 will have dementia in later life.  Tim Harford, presenter of More or Less, pointed out, though, that it's odd that this number was taken seriously by anyone, given that it is equivalent to thinking that predictions made 100 years ago, when AIDS wasn't known, antibiotics not yet discovered, and so on, would have any credibility.  And, the 1/3 estimate was based on 20-year-old data.  (A quick check of the prevalence of dementia in the UK is a bit confusing -- many sites caution that the number of people with Alzheimer's disease is rising rapidly.  It's an Alzheimer's time bomb, they warn.  But, given that the population is both aging and increasing, this isn't, in itself, a surprise, or very meaningful in relation to individual biological risk because, again, it's the fraction of the population that is affected that is the significant statistic.  To be clearer, if more people live longer, even the same age-specific risk of getting a disease will lead to more people with the disease, that is, a higher prevalence in the population.  Of course, the number of affected individuals is relevant to the health care burden.)
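
The arithmetic behind that point can be shown in a few lines of Python.  The age-specific prevalences and age structures below are hypothetical, chosen only to illustrate how an older population raises the crude figure even when age-specific risk does not change.

```python
# Hypothetical numbers only: the same age-specific prevalence of dementia,
# applied to an older age structure of the over-65 population, yields a
# higher crude prevalence -- and more affected people if the population
# also grows.

age_specific_prev = {"65-74": 0.03, "75-84": 0.10, "85+": 0.25}

current_structure = {"65-74": 0.60, "75-84": 0.30, "85+": 0.10}  # share of 65+
older_structure   = {"65-74": 0.45, "75-84": 0.35, "85+": 0.20}

def crude_prevalence(structure):
    return sum(structure[a] * age_specific_prev[a] for a in structure)

print(f"crude prevalence, current structure: {crude_prevalence(current_structure):.1%}")
print(f"crude prevalence, older structure  : {crude_prevalence(older_structure):.1%}")
```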

How predictable is dementia?
Carol Brayne, one of hundreds of authors on the Lancet report and interviewed for More or Less, speculates that the reported fall in prevalence has to do with changes in 'vascular health', as incidence of heart attacks and stroke have fallen as well.  She suggests that it seems as though the things we have been doing in western countries to prevent cardiovascular disease have been working.

Of course, this assumes we know the cause of dementia, and that it's in some sense a cardiovascular disease.  But we don't understand the cause nearly well enough to say this; in fact, like most chronic diseases, dementia is many different conditions, with many different causes.

The genetic causal factors related to Alzheimer's disease include mutations in a few genes, but these account for only a fraction of cases.  Mutations in the two presenilin genes can lead to early-onset Alzheimer's.  The most commonly discussed genetic risk factor is the E4 allele of the ApoE gene, whose physiology is related to fat transport in the blood.  It seems to be associated with the development of plaque in the brains of people with late-onset (60s and over) Alzheimer's, but the association is complex: people without the E4 allele also develop plaque, people with plaque may not have dementia, and the causal mechanisms are unclear.  Risk seems to depend on whether one carries one or two copies of the E4 allele, seems to be higher for women than for men, and is apparently affected by environmental factors, but the allele does seem to raise risk from something like 10-15% in people over 80 to 30-50%.
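
To put those rough numbers in perspective, here is an equally rough back-of-the-envelope sketch; the ranges are simply the approximate figures quoted above, not precise estimates.

```python
# Back-of-the-envelope arithmetic with the rough figures quoted above
# (roughly 10-15% risk over age 80 without the allele, 30-50% with it).
# These are illustrative ranges, not precise estimates.

baseline = (0.10, 0.15)   # approximate risk range without the E4 allele
carrier  = (0.30, 0.50)   # approximate risk range with the E4 allele

rr_low  = carrier[0] / baseline[1]   # most conservative relative risk
rr_high = carrier[1] / baseline[0]   # most generous relative risk
print(f"implied relative risk: roughly {rr_low:.0f}x to {rr_high:.0f}x")

# Even at the high end, half of carriers would never develop the disease:
# the allele shifts risk, it does not determine the outcome.
```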

What this means, even if the statistics were reliable, the risk estimates stable, and environmental contributions minimal, is that even having two copies of the risk allele is obviously not a guarantee of Alzheimer's disease.  And in some populations (Nigeria, e.g.), having two copies isn't associated with Alzheimer's at all.  In addition, while the association with increased risk has long been described, the physiology is still not understood.  GWAS have reported other genetic risk factors, but none nearly as consistent or as strong as ApoE4.

The reported decline in dementia prevalence is not new; we blogged in 2013 about dramatically decreasing rates in the UK, as well as in Denmark, as reported by Gina Kolata then.  So, how can it be declining rapidly, but the strongest risk factor we know of is genetic -- and the frequency of this variant is not changing enough to even begin to account for the data?  Or, is Carol Brayne right that dementia is a vascular disease, and vascular diseases are on the decline, so Alzheimer's is, too?

Indeed, even the definition of whether you 'have' Alzheimer's or not is changeable and imprecise, and researchers don't even agree on what an Alzheimer's brain looks like.  A good discussion of these various factors, including social and economic aspects and the history of studies of Alzheimer's, is the book The Alzheimer Conundrum, by Margaret Lock, a fine medical anthropologist at McGill in Canada (and a friend of ours).

Can Alzheimer's be prevented?
The causes of Alzheimer's disease are so poorly understood that it's said that the best prevention is to exercise, quit smoking and maintain a social life.  Very generic advice that could apply to a lot of things!  If we don't know what causes it -- if there are probably environmental risk factors that we don't really understand, relevant past environmental agents are unknown, future environments are impossible to predict, and genetic risk factors are not good predictors -- then we certainly don't know how to predict population prevalence rates, not to mention who is most likely to develop the disease.  (NB: this is pertinent to late-onset dementia; early-onset is more likely to have a genetic cause, and is thus more likely to be predictable.)

Given the experience of two generations in my family, should I or shouldn't I worry about developing dementia?  If my grandmother and great-aunt had the ApoE4 risk allele, my father may or may not, and my sisters and I may or may not.  If they did and my father does, it's a good example of an allele with "incomplete penetrance," for which either genetic background or environmental risk factors or both are also necessary.  Which makes predicting dementia difficult, whether or not we were to have the risk allele. If they didn't have it, something else caused their dementia, and we have no idea what that was.  Indeed, they were both social, never smoked, and walked to work for decades.

To me, as to most people, dementia is frightening.  But, obviously, my family history is useless in terms of determining my risk -- my grandmother had it, my father doesn't.

Still, every time I forget someone's name, I think of my grandmother.

Rare Disease Day and the promises of personalized medicine

Our daughter Ellen wrote the post that I republish below 3 years ago, and we've reposted it in commemoration of Rare Disease Day, Febru...