
Measuring intelligence

While listening to Dr. Stephen C. Meyer, I came across this quote of his on the web:

“But it is also necessary to distinguish Shannon information from information that performs a function or conveys a meaning. We must distinguish sequences of characters that are (a) merely improbable from sequences that are (b) improbable and also specifically arranged so as to perform a function. That is, we must distinguish information-carrying capacity from functional information.” 
― Stephen C. Meyer, Signature in the Cell: DNA and the Evidence for Intelligent Design

Meyer speaks of 'information that performs a function or conveys a meaning'; I think we are employing different definitions of those key words.  Let me try to restate his message in my terms.

Data: a recorded observation.  Data is a thing, an object, as perceived.

Information: a metric of data.  Just as we measure the weight of things in pounds or grams, we can measure the quantity of data in information.

Meaning: the result of mapping two systems of data to each other -- this is the semiotic definition of meaning.

So within my terms, information cannot convey a meaning, just as grams cannot convey anything about what is being measured.  If I say that an apple weighs 150 grams, in and of itself this is meaningless; I need to say that an apple weighs 10 grams more than a tangerine.  Similarly, information cannot perform a function, just as the 150 grams of an apple do not perform a function.  Metrics describe a relative relationship between data elements.  An orange weighs more than a tangerine.  A document has more information than a tweet.

Complexity is a funny word.  On the one hand it is synonymous with information: Shannon Information is equivalent to Shannon Complexity.  On the other hand, the word complexity is also used to describe a function: Kolmogorov complexity is a measure of how complex a program would need to be to create the data, i.e. the length of the shortest program that can produce it.
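As a toy illustration (mine, not Meyer's): a highly repetitive string can be recreated by a program far shorter than the string itself, while a random string essentially has to be spelled out in full.  Kolmogorov complexity itself is not computable; in practice, compressors are used to approximate it.

```python
import random
import string

# A repetitive string: the tiny program "'ab' * 500" reproduces it exactly,
# so its Kolmogorov complexity is small relative to its length.
repetitive = "ab" * 500

# A random string of the same length: barring coincidences, no program much
# shorter than the string itself can reproduce it, so its Kolmogorov
# complexity is roughly the length of the string.
random.seed(0)
rand = "".join(random.choice(string.ascii_lowercase) for _ in range(1000))

print(len(repetitive), len(rand))   # both strings are 1000 characters long
print(len("'ab' * 500"))            # the generating "program" is only 10 characters
```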

So it would be easy to rephrase Meyer's statement as: DNA is a complex message, as demonstrated by its Kolmogorov complexity.  However, Kolmogorov complexity is actually the thing we are trying to describe, so the argument would be self-referential, since Kolmogorov complexity presumes an intelligent designer to write the program.  What we are looking for is a clean way of objectively describing the information in a sequence of text or DNA, such that we can differentiate Kolmogorov-complex sequences from non-Kolmogorov-complex sequences.

So how do we measure the functionality of data, i.e. how much work does the data contain?  Can this data perform a function, or better yet, what quantity of functions can be performed with this data?  What is the potential energy in the data?

=-=-=-=-=
A golden rule in Natural Language Processing (NLP) and in Hermeneutics is contextual analysis: if I want to understand a word, I need to analyze its context.  The inverse is perhaps even more true: if I analyze similar words, I can create a contextual frame.  So if I have two sentences that share a word, I can create a frame of context.

This is probably the same rule that Semiotics defines in the creation of a sign: a meaning is created when two forms/symbols are mapped to each other, one form signifying another.

For example, if I have a single sentence, 'the cat ran', in and of itself I cannot extract meaning from those words.  However, with two sentences that share words, such as 'the cat ran' & 'the dog ran', I can leverage the similar context to map the two sentences to each other, 'the X ran', which then implies that 'cat' & 'dog' are synonymous within this context.
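Here is a minimal sketch of that frame-building step in Python (a toy of my own, not a standard NLP routine): given two sentences that differ in exactly one position, it returns the shared frame and the two words that fill the slot.

```python
def shared_frame(sentence_a, sentence_b):
    """Given two same-length sentences differing in exactly one word,
    return the shared frame (with 'X' in the slot) and the two fillers."""
    words_a, words_b = sentence_a.split(), sentence_b.split()
    if len(words_a) != len(words_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(words_a, words_b)) if a != b]
    if len(diffs) != 1:
        return None  # no single shared frame between these sentences
    i = diffs[0]
    frame = words_a[:i] + ["X"] + words_a[i + 1:]
    return " ".join(frame), (words_a[i], words_b[i])

print(shared_frame("the cat ran", "the dog ran"))
# ('the X ran', ('cat', 'dog')) -> 'cat' and 'dog' are interchangeable in this context
```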

Thus, the repetition of phrases enables meaningful communication.  The more context, the more meaning.  One way of thinking about this is that each sentence is a closed system, and each time I map a sentence to another sentence I create meaning.  However, that is not all.  There is also a hierarchy: after mapping all the sentences I can also map paragraphs, and again each mapping will create meaning.

What's a good way to measure hierarchical repetition?  Dictionary compression.  It just so happens that another measure of complexity is LZW complexity.  LZW creates a hierarchical dictionary of repetitions.  The more LZW complexity in a document, the more contextual information is available.
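Here is a simplified LZW compressor sketch in Python (textbook-style; real implementations add details such as fixed code widths): the dictionary grows with progressively longer repeated substrings, which is exactly the hierarchical dictionary of repetitions described above.

```python
def lzw_compress(text):
    """Return the LZW code sequence and the dictionary built along the way."""
    # Start with single-character entries; longer repeated substrings are added as they appear.
    dictionary = {chr(i): i for i in range(256)}
    current, codes = "", []
    for ch in text:
        candidate = current + ch
        if candidate in dictionary:
            current = candidate                      # keep extending the current match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = len(dictionary)  # record the new, longer repetition
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, dictionary

codes, d = lzw_compress("the cat ran the dog ran the cat ran")
print(len(codes))  # fewer codes than characters: repetitions are absorbed into the dictionary
```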

Let's say I have a document.  If the document is purely random it will have a very high Shannon complexity, i.e. high information.  It will also have a very high LZW complexity.  But it will carry zero functional complexity.

So let's introduce a Functional Coefficient = Shannon Complexity / LZW Complexity, which is approximately the Kolmogorov complexity.

Then the amount of functional information in a language is equal to the Functional Coefficient * LZW complexity...
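Below is a rough numerical sketch of this proposal in Python.  The function names are mine, and two stand-ins are worth flagging: character frequencies stand in for Shannon complexity, and zlib's compressed size stands in for LZW complexity (zlib uses LZ77 plus Huffman coding rather than LZW, but it plays the same dictionary-compression role).  Treating the ratio as an approximation of Kolmogorov complexity is my conjecture above, not an established result.

```python
import math
import random
import string
import zlib

def shannon_information_bits(text):
    """Total Shannon information of the text, in bits, estimated from character frequencies."""
    n = len(text)
    counts = {ch: text.count(ch) for ch in set(text)}
    entropy_per_char = -sum(c / n * math.log2(c / n) for c in counts.values())
    return entropy_per_char * n

def compression_complexity_bits(text):
    """Compressed size in bits -- used here as a stand-in for 'LZW complexity'."""
    return 8 * len(zlib.compress(text.encode("utf-8"), 9))

random.seed(0)
random_doc = "".join(random.choice(string.ascii_lowercase + " ") for _ in range(2000))
repetitive_doc = "the cat ran the dog ran " * 80   # similar length, lots of shared context

for name, doc in [("random", random_doc), ("repetitive", repetitive_doc)]:
    shannon = shannon_information_bits(doc)
    lzw_like = compression_complexity_bits(doc)
    functional_coefficient = shannon / lzw_like     # the Functional Coefficient defined above
    print(f"{name:10s} Shannon ~ {shannon:.0f} bits, compressed ~ {lzw_like} bits, coefficient ~ {functional_coefficient:.1f}")
```

On this toy pair, the coefficient for the random document should come out close to 1 (the compressor cannot do much better than the raw character entropy), while the repetitive document scores far higher, matching the intuition that it carries much more exploitable context per bit of raw information.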

Hence, I can have a message with little information that is meaningful, or a message with lots of information that carries multiple meanings.  Or lots of information with little meaning, or no information with no meaning.


