
1.6 Phase transitions, a measure of learning



https://rumble.com/vcg8gw-1.6-phase-transitions-a-measure-of-learning.html

Phase transitions demonstrate a loose coupling.  A tight coupling, like in KMeans, mirrors the distortion at each level.  A loose coupling enables the higher level to move at a different pace, disjoint from the lower level.  This separation between levels indicates that the levels represent different descriptions of the data; they speak different languages.

It is important to differentiate between the model, the hierarchy, the grammar, and the content, so in the next series I will do that.  The learning is in the model, not the content.  The content can be memorized; it is the relationships between the content that are learnt.

Here is a slightly different dataset; there are at least two apparent scales.  Let's see what KMeans does as we increase 'k':










Neat, right?  KMeans spreads its representations equally across the entire dataset, minimising the global loss of information.  It's a smooth transition from case to case: as we increase the number of centroids, KMeans shifts all of the centroids so that the representation spreads out.
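If you want to try this yourself, here is a minimal sketch of the experiment, assuming plain Lloyd iterations with a deterministic farthest-first start.  The toy dataset and the helper names (`two_scale_data`, `farthest_first`, `kmeans`) are my own illustrative assumptions, not the data or code behind the figures above.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_scale_data():
    """Hypothetical toy data: one blob far away, two tight blobs close together."""
    offsets = [(-0.1, 0.0), (0.1, 0.0), (0.0, -0.1), (0.0, 0.1)]
    blobs = [(0.0, 0.0), (10.0, 0.0), (10.0, 1.0)]
    return [(bx + dx, by + dy) for bx, by in blobs for dx, dy in offsets]


def farthest_first(points, k):
    """Deterministic start: repeatedly pick the point farthest from the chosen ones."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns (centroids, total squared distortion)."""
    centers = farthest_first(points, k)
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # An empty group keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
        if updated == centers:
            break
        centers = updated
    distortion = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, distortion


if __name__ == "__main__":
    data = two_scale_data()
    for k in (1, 2, 3, 4):
        centers, distortion = kmeans(data, k)
        print(k, [tuple(round(v, 2) for v in c) for c in centers], round(distortion, 3))
```

Printing the centroids for each 'k' lets you watch where the representation sits at each step; the distortion column is the global loss KMeans is minimising.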

Now take a look at faddc:








The thing to notice is that the representation is very stable up to a point, and then at a certain point there is a dramatic shift in the representation.  The faddc approach has created a new descriptive space, decoupled from the original data space.

What this means is that faddc preserves the representation even as the number of centroids changes.  This is like creating a word, 'citrus', for a category and then preserving the use of the word even as more data arrives and more memory is allocated: as long as the concept of the category is stable, the word is stable.

KMeans, on the other hand, will change words each time, so if I start a conversation with you utilising the word 'citrus', mid-conversation I might start using a different word.  In effect, words have no inherent meaning; they just constantly move to represent the global data.

Again, this is because KMeans does not learn anything; it tries to memorise the original data as best it can, while faddc has created a new descriptive space, decoupled from the original data space.
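One rough way to put a number on "the word is stable" is to ask how far each centroid at one step sits from its nearest centroid at the next step.  The sketch below is only an assumption of mine about how such a stability measure could look; it is not faddc, and `centroid_shift` and the centroid positions are made up for illustration.

```python
import math


def centroid_shift(old, new):
    """Mean distance from each old centroid to its nearest new centroid."""
    return sum(min(math.dist(o, n) for n in new) for o in old) / len(old)


# Made-up centroid positions, for illustration only.
stable = [
    [(0.0, 0.0), (10.0, 0.5)],
    [(0.0, 0.0), (10.0, 0.5), (5.0, 5.0)],   # old centroids stay put, one is added
]
drifting = [
    [(0.0, 0.0), (10.0, 0.5)],
    [(-1.0, 0.0), (4.0, 2.0), (11.0, 1.0)],  # every centroid moves
]

if __name__ == "__main__":
    for name, (before, after) in (("stable", stable), ("drifting", drifting)):
        print(name, centroid_shift(before, after))
```

A shift near zero says the existing 'words' kept their meaning when a centroid was added; a sudden jump after a long run of near-zero values is what a phase transition would look like under this measure.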

Here is a subtle but critical point: there are actually two inherent descriptive levels in the data.  The first is obvious, the feature space, for example the color of the objects: a red apple, an orange orange.  The second descriptive space is less obvious: it is the quantity of observations.  This is more significant when we talk about the co-occurrence of features.

So when multiple observations have the same relative features, it is not that we learn only from the features.  We also learn from the number of observations.
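To make the two levels concrete, here is a small sketch using made-up observations (the fruit list and the `describe` helper are hypothetical).  The set of distinct feature combinations is the first level; the count of how often each combination was observed is the second.

```python
from collections import Counter

# Made-up observations: (object, color) pairs.
observations = [
    ("apple", "red"), ("apple", "red"), ("apple", "green"),
    ("orange", "orange"), ("orange", "orange"), ("orange", "orange"),
]


def describe(obs):
    """Return (distinct feature combinations, how often each co-occurs)."""
    feature_space = set(obs)   # first level: which combinations exist
    counts = Counter(obs)      # second level: how many times each was seen
    return feature_space, counts


if __name__ == "__main__":
    space, counts = describe(observations)
    print(sorted(space))
    print(counts.most_common())
```

Two datasets can share the same feature space and still differ in their counts, which is the sense in which the number of observations carries its own information.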

This idea will become more significant when we talk about sequence analysis, and the ability to factor out the sequence information independently of the content information.



