
How does learning work, how much data is needed to learn and don't cross the streams!

Learning is the process by which a model is constructed; the model describes a set of observations. The more compact the model, the better the learning process is considered to be. This shows up in the model's ability to predict and to generalize to out-of-sample data. But let's not confuse learning with classification. Again, the essence of learning is the construction of the model and the condensed representation of the observations.

So how many observations, data elements, are required to construct a model? 

[Nassim Taleb addresses this question here: https://arxiv.org/pdf/1802.05495.pdf]

The typical answer, for a Gaussian/normal distribution, is 30 observations. Simple. We construct a model of the data, its mean and variance, by calculating the average and variance of our 30 sample observations.
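As a minimal sketch of what "constructing the model" means here, the snippet below condenses 30 observations into two numbers. The data is synthetic: the true mean of 100 and standard deviation of 15 are made-up values for illustration, and `estimate_gaussian` is a hypothetical helper name.

```python
import random
import statistics

def estimate_gaussian(samples):
    """Return (sample mean, sample variance) for a list of numbers."""
    mean = statistics.mean(samples)
    variance = statistics.variance(samples)  # unbiased estimate, divides by n - 1
    return mean, variance

# Synthetic data only: 30 draws from a normal distribution whose
# true mean (100) and standard deviation (15) are assumptions for this sketch.
rng = random.Random(0)
observations = [rng.gauss(100, 15) for _ in range(30)]

mean, variance = estimate_gaussian(observations)
print(f"estimated mean = {mean:.1f}, estimated variance = {variance:.1f}")
```

The two estimated numbers are the whole model: 30 observations condensed into a mean and a variance.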

Clearly this is not true in all cases; we do not always have simple normal distributions, and in more complex cases we would require more observations. But let's assume the magic number of 30 is true.

So what happens when you increase the dimensionality of the data? Well, the 'curse of dimensionality' takes over: the number of observations required grows exponentially with the number of dimensions. If one dimension needs 30 observations, two need 30^2 = 900 and three need 30^3 = 27,000. So a relatively simple model of two or three dimensions, for example the macronutrients carbs/protein/fat, would require 900 to 27,000 observations. In other words, if we wanted to describe the effect a diet has on a person by measuring the macronutrients, we would need a sample size in the tens of thousands.
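The arithmetic can be sketched in a few lines. This assumes, as the post does, that every dimension needs the same 30 observations and that the requirements multiply; `observations_needed` is a hypothetical helper name.

```python
def observations_needed(dimensions, per_dimension=30):
    """Observations required if each dimension needs `per_dimension` samples
    and the requirements multiply across dimensions (the post's assumption)."""
    return per_dimension ** dimensions

for d in (1, 2, 3):
    print(d, observations_needed(d))
# prints:
# 1 30
# 2 900
# 3 27000
```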

Now what happens when we go to something like micronutrients? Well, there are nine essential amino acids, which gives us 30^9, a very large number (19,683,000,000,000). So even in an ideal case where all other confounding variables were isolated, you would still need an enormous test population (actually two or three, counting control and placebo groups as well).
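For scale, the number works out as follows. The second line assumes that a control group and a placebo group would each need to be as large as the test population, which is a simplification for illustration:

```latex
\[
30^{9} = 3^{9} \times 10^{9} = 19683 \times 10^{9} \approx 1.97 \times 10^{13}
\]
\[
3 \times 30^{9} = 59049 \times 10^{9} \approx 5.9 \times 10^{13}
\]
```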

This is why Prof. John Ioannidis says: "Risk-conferring nutritional combinations may vary by an individual’s genetic background, metabolic profile, age, or environmental exposures. Disentangling the potential influence on health outcomes of a single dietary component from these other variables is challenging, if not impossible" [my emphasis]. John P. A. Ioannidis, MD, DSc, The Challenge of Reforming Nutritional Epidemiologic Research.

What does that mean in practice? It means that all research based on observational data done today vastly underestimates the amount of data it needs. Yep, nothing published is good science. Sugar is bad for you? Fat is good? Eggs? Vaccines?

But wait, you say, that can't be; I know some things work. Gravity seems to be true, and it is based on observational data (at least at first it was).

There are two answers to this. First, our subjective definition of truth ("gravity is true") is bolstered by a good argument. We perceive a fact as true if we have confidence in it, and even bad data science provides confidence.

But the better answer is that when we analyze gravity we simplify the problem space: it ends up being a single-dimensional problem, and we don't need that many observations. Thirty is enough.

Wait, gravity is complicated; just measuring the cannonball vs. the musket ball confused lots of people, including Galileo. There are factors such as wind resistance that confound the variables and confuse the measurements. But we simplified.
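A rough sketch of the simplified, one-dimensional version: ignore wind resistance, drop an object from a known height 30 times, and estimate a single number, g. Everything here is synthetic and assumed for illustration: the 10 m drop height, the 0.01 s timing error, and the helper name `estimate_g`.

```python
import math
import random
import statistics

G_TRUE = 9.81        # m/s^2, used only to generate the synthetic data
HEIGHT = 10.0        # m, hypothetical drop height
TIMING_NOISE = 0.01  # s, hypothetical stopwatch error (standard deviation)

def estimate_g(fall_times, height):
    """Average the per-drop estimates g = 2h / t^2 (no air resistance)."""
    return statistics.mean([2 * height / t ** 2 for t in fall_times])

# Simulate 30 timed drops: the ideal fall time plus a small random timing error.
rng = random.Random(1)
ideal_time = math.sqrt(2 * HEIGHT / G_TRUE)
fall_times = [ideal_time + rng.gauss(0, TIMING_NOISE) for _ in range(30)]

g_estimate = estimate_g(fall_times, HEIGHT)
print(f"estimated g = {g_estimate:.2f} m/s^2")
```

With one unknown and small measurement noise, thirty drops already land close to the value used to generate the data. Bring back wind resistance, shape, and mass as extra dimensions and the count climbs again.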

The trick to simplification is abstraction.  Going up a level of representation....

 



