
Measuring intelligence

While listening to Dr. Stephen C. Meyer, I came across this quote of his on the web:

“But it is also necessary to distinguish Shannon information from information that performs a function or conveys a meaning. We must distinguish sequences of characters that are (a) merely improbable from sequences that are (b) improbable and also specifically arranged so as to perform a function. That is, we must distinguish information-carrying capacity from functional information.” 
― Stephen C. Meyer, Signature in the Cell: DNA and the Evidence for Intelligent Design

Meyer speaks of 'information that performs a function or conveys a meaning'; I think we are employing different definitions of those key words.  Let me try to restate his message in my terms.

Data: a recorded observation.  Data is a thing, an object, as perceived.

Information: a metric of data.  Just as we measure the weight of things in pounds or grams, we can measure the quantity of data in information.

Meaning: the result of mapping two systems of data to each other -- this is the semiotic definition of meaning.

So within my terms, information cannot convey a meaning, just as grams cannot convey anything about what is being measured.  If I say that an apple weighs 150 grams, in and of itself this is meaningless; I need to say that an apple weighs 10 grams more than a tangerine.  Similarly, information cannot perform a function, just as the 150 grams of an apple do not perform a function.  Metrics describe a relative relationship between data elements.  An orange weighs more than a tangerine.  A document has more information than a tweet.

Complexity is a funny word.  On the one hand it is synonymous with information: Shannon Information is equivalent to Shannon Complexity.  On the other hand, the word complexity is also used to describe a function: Kolmogorov complexity is a measure of how complex a program would need to be to create the data, i.e. the length of the shortest program that can produce it.
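As a toy illustration (mine, not Meyer's): a highly repetitive string can be recreated by a program far shorter than the string itself, while a random string essentially has to be spelled out in full.  Kolmogorov complexity itself is not computable; in practice, compressors are used to approximate it.

```python
import random
import string

# A repetitive string: the tiny program "'ab' * 500" reproduces it exactly,
# so its Kolmogorov complexity is small relative to its length.
repetitive = "ab" * 500

# A random string of the same length: barring coincidences, no program much
# shorter than the string itself can reproduce it, so its Kolmogorov
# complexity is roughly the length of the string.
random.seed(0)
rand = "".join(random.choice(string.ascii_lowercase) for _ in range(1000))

print(len(repetitive), len(rand))   # both strings are 1000 characters long
print(len("'ab' * 500"))            # the generating "program" is only 10 characters
```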

So it would be easy to rephrase Meyer's statement as: DNA is a complex message, as demonstrated by its Kolmogorov complexity.  However, Kolmogorov complexity is actually the thing we are trying to describe, so the argument would be self-referential, since Kolmogorov complexity presumes an intelligent designer to write the program.  What we are looking for is a clean way of objectively describing the information in a sequence of text or DNA, such that we can differentiate Kolmogorov-complex sequences from non-Kolmogorov-complex sequences.

So how do we measure the functionality of data, i.e. how much work does the data contain?  Can this data perform a function, or better yet, what quantity of functions can be performed with this data?  What is the potential energy in the data?

=-=-=-=-=
A golden rule in Natural Language Processing (NLP) and in Hermeneutics is contextual analysis: if I want to understand a word, I need to analyze its context.  The inverse is perhaps even more true: if I analyze similar words, I can create a contextual frame.  So if I have two sentences that share a word, I can create a frame of context.

This is probably the same rule that Semiotics defines in the creation of a sign: a meaning is created when two forms/symbols are mapped to each other, one form signifying another.

For example, if I have a single sentence, 'the cat ran', in and of itself I cannot extract meaning from those words.  However, with two sentences that share words, such as 'the cat ran' & 'the dog ran', I can leverage the similar context to map the two sentences to each other, 'the X ran', which then implies that 'cat' & 'dog' are synonymous within this context.
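Here is a minimal sketch of that frame-building step in Python (a toy of my own, not a standard NLP routine): given two sentences that differ in exactly one position, it returns the shared frame and the two words that fill the slot.

```python
def shared_frame(sentence_a, sentence_b):
    """Given two same-length sentences differing in exactly one word,
    return the shared frame (with 'X' in the slot) and the two fillers."""
    words_a, words_b = sentence_a.split(), sentence_b.split()
    if len(words_a) != len(words_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(words_a, words_b)) if a != b]
    if len(diffs) != 1:
        return None  # no single shared frame between these sentences
    i = diffs[0]
    frame = words_a[:i] + ["X"] + words_a[i + 1:]
    return " ".join(frame), (words_a[i], words_b[i])

print(shared_frame("the cat ran", "the dog ran"))
# ('the X ran', ('cat', 'dog')) -> 'cat' and 'dog' are interchangeable in this context
```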

Thus, the repetition of phrases enables meaningful communication.  The more context, the more meaning.  One way of thinking about this is that each sentence is a closed system, and each time I map a sentence to another sentence I create meaning.  However, that is not all.  There is also a hierarchy: after mapping all the sentences I can also map paragraphs, and again each mapping will create meaning.

What's a good way to measure hierarchical repetition?  Dictionary compression.  It just so happens that another measure of complexity is LZW complexity.  LZW creates a hierarchical dictionary of repetitions.  The more LZW complexity in a document, the more contextual information is available.
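Here is a simplified LZW compressor sketch in Python (textbook-style; real implementations add details such as fixed code widths): the dictionary grows with progressively longer repeated substrings, which is exactly the hierarchical dictionary of repetitions described above.

```python
def lzw_compress(text):
    """Return the LZW code sequence and the dictionary built along the way."""
    # Start with single-character entries; longer repeated substrings are added as they appear.
    dictionary = {chr(i): i for i in range(256)}
    current, codes = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                      # keep extending the current match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # record the new, longer repetition
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, dictionary

codes, d = lzw_compress("the cat ran the dog ran the cat ran")
print(len(codes))  # fewer codes than characters: repetitions are absorbed into the dictionary
```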

Let's say I have a document.  If the document is purely random it will have a very high Shannon complexity, i.e. high information.  It will also have a very high LZW complexity.  But it will carry zero functional complexity.

So let's introduce a Functional Coefficient = Shannon Complexity / LZW Complexity, which is approximately the Kolmogorov complexity.

Then the amount of functional information in a language is equal to the Functional Coefficient * LZW complexity...
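Below is a rough numerical sketch of this proposal in Python.  The function names are mine, and two stand-ins are worth flagging: character frequencies stand in for Shannon complexity, and zlib's compressed size stands in for LZW complexity (zlib uses LZ77 plus Huffman coding rather than LZW, but it plays the same dictionary-compression role).  Treating the ratio as an approximation of Kolmogorov complexity is my conjecture above, not an established result.

```python
import math
import random
import string
import zlib

def shannon_information_bits(text):
    """Total Shannon information of the text, in bits, estimated from character frequencies."""
    n = len(text)
    counts = {ch: text.count(ch) for ch in set(text)}
    entropy_per_char = -sum(c / n * math.log2(c / n) for c in counts.values())
    return entropy_per_char * n

def compression_complexity_bits(text):
    """Compressed size in bits -- used here as a stand-in for 'LZW complexity'."""
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))

random.seed(0)
random_doc = "".join(random.choice(string.ascii_lowercase + " ") for _ in range(2000))
repetitive_doc = "the cat ran the dog ran " * 80   # similar length, lots of shared context

for name, doc in [("random", random_doc), ("repetitive", repetitive_doc)]:
    shannon = shannon_information_bits(doc)
    lzw_like = compression_complexity_bits(doc)
    functional_coefficient = shannon / lzw_like     # the Functional Coefficient defined above
    print(f"{name:10s} Shannon ~ {shannon:.0f} bits, compressed ~ {lzw_like} bits, coefficient ~ {functional_coefficient:.1f}")
```

On this toy pair, the coefficient for the random document should come out close to 1 (the compressor cannot do much better than the raw character entropy), while the repetitive document scores far higher, matching the intuition that it carries much more exploitable context per bit of raw information.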

Hence, I can have a message with little information that is meaningful, or a message with lots of information that carries multiple meanings.  Or lots of information with little meaning, or no information with no meaning.


