
No, a penguin is not an ashcan: how to evolve from supervised to semi-supervised learning

Recently Yann LeCun complimented the authors of 'A ConvNet for the 2020s':

https://mobile.twitter.com/ylecun/status/1481194969830498308?s=20

https://github.com/facebookresearch/ConvNeXt



These statements imply that continued improvements in the metrics of success indicate that learning is improving. Furthermore, LeCun says, common sense reinforces the idea that 'helpful tricks' succeed in increasing the learning that occurs in these models.



But are these models learning? Are they learning better? Or have they perhaps succeeded at overfitting and scoring better without learning anything new?

We took a look at what the model learned, not just how it scored on its own metric. To this end we created a graph with links between each image and its top-5 classifications, where the weight of each link is proportional to the score of that class.

Here are the data files:
https://github.com/DaliaSmirnov/imagenet_research/blob/main/prediction_df_resnet50.parq?raw=true
https://github.com/DaliaSmirnov/imagenet_research/blob/main/prediction_df.convnext.parq?raw=true
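The graph construction described above can be sketched roughly as follows. This is a minimal illustration, not the authors' actual code: the column names (`image_id`, `class_name`, `score`) are assumptions, and the tiny inline DataFrame stands in for the real parquet files linked above.

```python
import pandas as pd
import networkx as nx

# Stand-in for pd.read_parquet("prediction_df_resnet50.parq"):
# each image keeps its top-5 classes and their prediction scores.
# Values here are illustrative, not taken from the real data files.
preds = pd.DataFrame({
    "image_id":   ["img1"] * 5,
    "class_name": ["king_penguin", "ashcan", "barrel", "goose", "albatross"],
    "score":      [0.52, 0.21, 0.11, 0.09, 0.07],
})

def build_prediction_graph(df):
    """Bipartite graph: each image node links to its top-5 class nodes,
    with edge weight proportional to the prediction score."""
    g = nx.Graph()
    for row in df.itertuples(index=False):
        g.add_edge(("image", row.image_id), ("class", row.class_name),
                   weight=row.score)
    return g

g = build_prediction_graph(preds)
print(g.number_of_nodes(), g.number_of_edges())  # 6 5
```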


As you can see, the ConvNeXt model thinks king penguins are similar to ashcans. They are not. Not by any metric: not semantically, not visually, just not.



Moreover, if we compare ResNet50 (a fairly old model with a significantly lower 'score') to the latest and greatest ConvNeXt, we can see that both models basically learned the same things.

Here is ResNet50:


and here is ConvNeXt:

Very similar: while ResNet50 does not connect the worlds of electric ray and stingray as strongly as ConvNeXt does, both models still seem basically the same.
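One way to make "both models learned the same things" quantitative is to compare the edge sets of the two prediction graphs. A minimal sketch, with illustrative edges rather than ones taken from the real data files:

```python
def edge_overlap(edges_a, edges_b):
    """Jaccard overlap between two sets of (image, class) edges:
    |intersection| / |union|, i.e. 1.0 means identical predictions."""
    a, b = set(edges_a), set(edges_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical top-3 edges for one image under each model.
resnet_edges   = {("img1", "king_penguin"), ("img1", "goose"), ("img1", "barrel")}
convnext_edges = {("img1", "king_penguin"), ("img1", "goose"), ("img1", "ashcan")}

print(edge_overlap(resnet_edges, convnext_edges))  # 0.5
```

A weighted variant (e.g. cosine similarity over edge weights) would also account for how strongly each model believes each link.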

Another way to analyze the models is to look at their semantic maps. These maps are generated from the graph above.
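A minimal sketch of how such a semantic map can be derived from the graph: classes that share predicted images are treated as close, and hierarchical clustering turns those distances into a tree. The similarity values below are illustrative placeholders, not measurements from the real models.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

classes = ["ashcan", "barrel", "king_penguin", "electric_ray", "stingray"]

# Pairwise class similarity, e.g. the (rescaled) weight of images on
# which the two classes co-occur in top-5 predictions. Illustrative.
sim = np.array([
    [1.0, 0.8, 0.3, 0.0, 0.0],
    [0.8, 1.0, 0.2, 0.0, 0.0],
    [0.3, 0.2, 1.0, 0.1, 0.1],
    [0.0, 0.0, 0.1, 1.0, 0.9],
    [0.0, 0.0, 0.1, 0.9, 1.0],
])
dist = 1.0 - sim
np.fill_diagonal(dist, 0.0)

# Average-linkage clustering over the condensed distance matrix.
tree = linkage(squareform(dist), method="average")
groups = fcluster(tree, t=0.5, criterion="distance")
print(dict(zip(classes, groups)))
```

Cutting the tree at different heights shows the hierarchy: here ashcan and barrel merge early, the rays merge early, and king_penguin sits apart until higher up.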

Here is ResNet50 

ResNet50 places ashcan in the same group as barrels.

ConvNeXt (below) does something different: it breaks up the semantic hierarchy in a telling way.


At the lowest level of the tree ConvNeXt thinks 'ashcan' and 'king_penguin' are very similar; however, one level up, the semantic environment is more clearly about cans and barrels. This makes me think king penguins have a cylindrical shape that ConvNeXt identifies.

So are these models learning? Are they learning better? I don't think so; I think they have succeeded at overfitting and scoring better but have not learned anything new.

------
Perhaps the technique we are introducing can help solve the problem it highlights. We propose looking at the semantically rich information available in the graph, and in particular at comparative graphs composed from different models.

In a recent interview with Lex Fridman, LeCun discusses the limits of the information available to supervised learning and the potential information available to semi-supervised learning.

https://www.youtube.com/watch?v=SGzMElJ11Cc

So perhaps a two-stage learning methodology is called for, where stage one is simple supervised learning with limited information, followed by stage two, semi-supervised graph learning with rich contextual information.
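A toy sketch of what the second stage might look like: semi-supervised label propagation over a graph whose edges carry the contextual information. The graph, weights, and labels below are entirely illustrative; a real stage two would start from the prediction graph and the supervised model's outputs.

```python
import numpy as np

# Weighted adjacency for 4 nodes: node 0 is labelled class 0,
# node 3 is labelled class 1, nodes 1 and 2 are unlabelled.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.2, 0.0],
    [0.0, 0.2, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
labels = np.array([0, -1, -1, 1])   # -1 marks unlabelled nodes
n_classes = 2

# Class-score matrix; unlabelled rows start uniform.
F = np.full((len(labels), n_classes), 0.5)
for i, y in enumerate(labels):
    if y >= 0:
        F[i] = np.eye(n_classes)[y]

D_inv = np.diag(1.0 / W.sum(axis=1))
for _ in range(100):
    F = D_inv @ W @ F               # average scores from neighbours
    for i, y in enumerate(labels):  # clamp the labelled nodes
        if y >= 0:
            F[i] = np.eye(n_classes)[y]

print(F.argmax(axis=1))  # [0 0 1 1]
```

Node 1 inherits class 0 from its strong link to node 0, and node 2 inherits class 1 from node 3: labels flow along the graph's contextual structure rather than from per-image supervision alone.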



