
1.65 Phase transitions, a measure of learning


https://rumble.com/vcj8fk-1.65-phase-transitions-a-measure-of-learning.html

Let's compare KMeans to faddc.

[Figures: KMeans | faddc]

That was quite dramatic. How did we get there?

[Figures: KMeans | faddc]

Neat, right? KMeans spreads its representations evenly across the entire dataset, minimising the global loss of information.
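
To make that "global loss" concrete: scikit-learn's KMeans reports it as `inertia_`, the sum of squared distances from each point to its nearest centroid. Here is a minimal NumPy sketch of that quantity. The helper name `distortion` and the toy points are made up for illustration and are not part of the code in this post.

```python
import numpy as np


def distortion(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    # Pairwise squared distances, shape (n_points, n_centroids).
    diffs = points[:, None, :] - centroids[None, :, :]
    sq_dists = (diffs ** 2).sum(axis=2)
    # Each point contributes only the distance to its closest centroid.
    return sq_dists.min(axis=1).sum()


# Toy data (illustrative only): two tight pairs of points, far apart.
toy_points = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
one_centroid = [[5.0, 1.0]]
two_centroids = [[0.0, 1.0], [10.0, 1.0]]

loss_k1 = distortion(toy_points, one_centroid)
loss_k2 = distortion(toy_points, two_centroids)
```

Going from one centroid to two makes the loss collapse on this toy data, because each pair gets its own representative. That kind of drop is what the rest of this post tracks as 'k' grows.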

The thing to notice with faddc is that its representation stays very stable up to a certain point, and then shifts dramatically.


Here I graph the derivative of the energy, that is, the change in distortion from one 'k' to the next.
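
As a rough sketch of that "derivative": take the distortion at each 'k', subtract the distortion at the next 'k', and scale by the largest drop. The distortion values and the helper name `normalised_drops` are made up for illustration; only the differencing and scaling steps follow what this post describes.

```python
import numpy as np


def normalised_drops(distortions):
    """Drop in distortion from each k to k + 1, scaled so the largest drop is 1."""
    d = np.asarray(distortions, dtype=float)
    drops = d[:-1] - d[1:]  # distortion(k) - distortion(k + 1)
    return drops / drops.max()


# Made-up distortion values for k = 1..6, for illustration only.
ks = np.arange(1, 7)
toy_distortions = [100.0, 90.0, 85.0, 25.0, 22.0, 20.0]

drops = normalised_drops(toy_distortions)
# The sharpest drop marks the candidate "phase transition".
transition_k = int(ks[np.argmax(drops)])
```

With these toy numbers the largest drop is the step from k = 3 to k = 4, so `transition_k` is 3. Unlike a pandas `shift(-1)`, which leaves a NaN in the last row, this version simply returns one fewer value than it was given.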


Here is the code:

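import numpy as np

import pandas as pd

import matplotlib.pyplot as plt

# Faddc, X and X1 are not defined in this snippet; they are assumed to exist already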

X2 = (X / 4) + np.array([20])

X2 = X2.sample(X2.shape[0] // 10)

data = pd.concat([X, X1, X2])  # DataFrame.append was removed in pandas 2.0

plt.close()

plt.scatter(data[0], data[1])


from sklearn.cluster import KMeans

dists = []

for k in range(2, 101):  # start at 2 so that KMeans below gets n_clusters = k - 1 >= 1

    faddc = Faddc(n_cluster=k, feature_count=2)

    faddc.fit(data.values)

        

    centroids_df = pd.DataFrame(faddc.m_centroids)

    plt.close()

    plt.scatter(X[0], X[1])

    plt.scatter(X1[0], X1[1])

    plt.scatter(X2[0], X2[1])

    plt.scatter(centroids_df[0], centroids_df[1], s=faddc.m_count)

    # k - 1, since the last centroid is always just the last data point

    plt.savefig('faddc_k' + str(k - 1) + '_2scales.png', dpi=300)


    kmeans = KMeans(n_clusters=k - 1, random_state=0).fit(data)

    centroids_df = pd.DataFrame(kmeans.cluster_centers_)

    plt.close()

    plt.scatter(X[0], X[1])

    plt.scatter(X1[0], X1[1])

    plt.scatter(X2[0], X2[1])


    labels_df = pd.DataFrame(kmeans.labels_)

    # cluster sizes ordered by cluster label, so they line up with the centroid rows in the plot
    count_df = labels_df[0].value_counts().sort_index()

    plt.scatter(centroids_df[0], centroids_df[1], s=count_df)

    plt.savefig('kmeans_k' + str(k - 1) + '_2scales.png', dpi=300)


    y, y_err = faddc.predict(data.values)

    dists.append([k, np.min(y_err, axis=1).sum(), kmeans.inertia_])


dists_df = pd.DataFrame(dists, columns=['k', 'faddc', 'kmeans'])

dists_df['faddc_diff'] = dists_df['faddc'] - dists_df['faddc'].shift(-1)

dists_df['kmeans_diff'] = dists_df['kmeans'] - dists_df['kmeans'].shift(-1)

dists_df['k'] = dists_df['k'] - 1

plt.close()

dists_df['faddc_diff'] = dists_df['faddc_diff'] / dists_df['faddc_diff'].max()

dists_df['kmeans_diff'] = dists_df['kmeans_diff'] / dists_df['kmeans_diff'].max()

plt.plot(dists_df['k'], dists_df['faddc_diff'], label='faddc')

plt.plot(dists_df['k'], dists_df['kmeans_diff'], label='kmeans')

plt.legend()  # show which curve is which
