| Connectionism
Connectionism is an approach in the fields of cognitive science,
neuroscience, psychology and philosophy of mind. Connectionism models
mental or behavioral phenomena as the emergent processes of interconnected
networks of simple units. There are many different forms of connectionism,
but the most common forms utilize neural network models.
Basic principles
The central connectionist principle is that mental phenomena can
be described by interconnected networks of simple units. The form
of the connections and the units can vary from model to model. For
example, units in the network could represent neurons and the connections
could represent synapses. Another model might make each unit in
the network a word, and each connection an indication of semantic
similarity.
Spreading activation
Most connectionist models include time: a variable represents time,
and the network changes as it advances. A closely related and extremely
common aspect of connectionist models is activation. At any time a
unit in the network has an activation, which is a numerical value
intended to represent some aspect of the unit.
For example, if the units in the model are neurons the activation
could represent the probability that the neuron would generate an
action potential spike. If the model is a spreading activation model
then over time a unit's activation spreads to all the other units
connected to it. Spreading activation is always a feature of neural
network connectionist models.
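As a minimal sketch of the idea, the following Python assumes a tiny word network in which each connection carries a similarity weight and a fixed fraction of each unit's activation is passed to its neighbours at every time step. The function name, the `decay` parameter, the example words and the weights are all illustrative assumptions, not part of any particular published model.

```python
def spread_activation(activation, connections, decay=0.5, steps=1):
    """Spread activation along weighted connections for a number of time steps.

    activation:  dict mapping unit name -> current activation (float)
    connections: dict mapping unit name -> list of (neighbour, weight) pairs
    decay:       fraction of a unit's activation passed on per step (assumed)
    """
    current = dict(activation)
    for _ in range(steps):
        nxt = dict(current)  # units keep their own activation in this sketch
        for unit, value in current.items():
            for neighbour, weight in connections.get(unit, []):
                nxt[neighbour] = nxt.get(neighbour, 0.0) + decay * weight * value
        current = nxt
    return current


# Hypothetical semantic network: units are words, weights are similarities.
connections = {
    "doctor": [("nurse", 0.8), ("hospital", 0.6)],
    "nurse": [("hospital", 0.5)],
}
initial = {"doctor": 1.0, "nurse": 0.0, "hospital": 0.0}

after_one = spread_activation(initial, connections, steps=1)
after_two = spread_activation(initial, connections, steps=2)
```

After one step only the direct neighbours of "doctor" become active; after two steps "hospital" also receives activation indirectly through "nurse".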
Neural networks
Neural networks are by far the dominant form of connectionist model
today. A lot of research utilizing neural networks is carried out
under the more general name "connectionist". These connectionist
models adhere to two major principles regarding the mind:
- Any given mental state can be described as an N-dimensional vector
of numeric activation values over neural units in a network.
- Memory is created by modifying the strength of the connections between
neural units. The connection strengths, or "weights",
are generally represented as an N×N matrix.
Though there is a large variety of neural network models, they very
rarely stray from these two basic principles. Most of the variety
comes from:
- Interpretation of units: units can be interpreted as neurons
or groups of neurons.
- Definition of activation: activation can be defined in a variety
of fashions. For example, in a Boltzmann machine, the activation
is interpreted as the probability of generating an action potential
spike, and it is determined via a logistic function on the sum of
the inputs to a unit.
- Learning algorithm: different networks modify their connections
differently. Generally, any mathematically defined change in connection
weights over time is referred to as the "learning algorithm".
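The two principles and the logistic definition of activation can be sketched together in a few lines of Python. This is only an illustration under stated assumptions: the state is a plain list of N numbers, `weights[i][j]` is taken to be the strength of the connection from unit j to unit i, and the update is deterministic, whereas a real Boltzmann machine would treat the logistic value as a firing probability and sample from it. The names `logistic` and `update` are hypothetical.

```python
import math


def logistic(x):
    """Logistic function mapping any real input into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def update(activations, weights):
    """Return the next N-dimensional activation vector.

    activations: list of N activation values (the current "mental state")
    weights:     N x N nested list; weights[i][j] is the connection j -> i
    """
    n = len(activations)
    return [
        logistic(sum(weights[i][j] * activations[j] for j in range(n)))
        for i in range(n)
    ]


# Two units: unit 1 excites unit 0, and unit 0 inhibits unit 1.
state = [1.0, 1.0]
weights = [
    [0.0, 2.0],
    [-2.0, 0.0],
]
next_state = update(state, weights)
```

With these weights the excited unit ends up with an activation above 0.5 and the inhibited unit below 0.5.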
Connectionists are generally in agreement that recurrent neural
networks (networks wherein connections of the network can form a
directed cycle) are a better model of the brain than feedforward
neural networks (networks with no directed cycles). A lot of recurrent
connectionist models incorporate dynamical systems theory as well.
Many researchers, such as the connectionist Paul Smolensky, have
argued that the direction connectionist models will take is towards
fully continuous, high-dimensional, non-linear, dynamic systems
approaches.
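The distinction between recurrent and feedforward networks drawn above comes down to whether the connection graph contains a directed cycle. A minimal sketch of that check, assuming the network is given as a dictionary from each unit to the units it connects to (the function name and the example unit names are hypothetical), could look like this:

```python
def has_directed_cycle(connections):
    """Return True if the connection graph contains a directed cycle.

    connections: dict mapping unit -> list of units it sends output to
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    units = set(connections) | {t for ts in connections.values() for t in ts}
    colour = {u: WHITE for u in units}

    def visit(u):
        colour[u] = GREY
        for v in connections.get(u, []):
            if colour[v] == GREY:  # reached a unit already on the current path
                return True
            if colour[v] == WHITE and visit(v):
                return True
        colour[u] = BLACK
        return False

    return any(colour[u] == WHITE and visit(u) for u in units)


feedforward = {"input": ["hidden"], "hidden": ["output"]}
recurrent = {"input": ["hidden"], "hidden": ["output", "context"],
             "context": ["hidden"]}

is_feedforward_cyclic = has_directed_cycle(feedforward)
is_recurrent_cyclic = has_directed_cycle(recurrent)
```

The second network feeds the hidden unit's output back to itself through a context unit, so it counts as recurrent in the sense used here.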
Biological realism
The neural network branch of connectionism suggests that the study
of mental activity is really the study of neural systems. This links
connectionism to neuroscience, and models involve varying degrees
of biological realism. Connectionist work in general need not be
biologically realistic, but some neural network researchers try
to model the biological aspects of natural neural systems very closely.
In addition, many authors find the clear link between neural activity
and cognition to be an appealing aspect of connectionism. However,
this link is also a source of criticism, as some people view it as
reductionism.
Learning
Connectionists generally stress the importance of learning in their
models. As a result, many sophisticated learning procedures for
neural networks have been developed by connectionists. Learning
always involves modifying the connection weights. These procedures
generally use mathematical formulas to determine the change in weights
when given sets of data consisting of activation vectors for some
subset of the neural units.
By formalizing learning in such a way, connectionists have many
tools at their disposal. A very common tactic in connectionist learning
methods is to incorporate gradient descent over an error surface
in a space defined by the weight matrix. All gradient descent learning
in connectionist models involves changing each weight in proportion
to the negative of the partial derivative of the error with respect
to that weight, so that each change moves the network downhill on the
error surface. Backpropagation, first made popular in the 1980s, is
probably the most commonly known connectionist gradient descent
algorithm today.
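As an illustrative sketch only, the following Python applies gradient descent to the simplest possible case: a single linear unit with one weight and one bias, trained on a squared-error measure. The function name, learning rate, number of epochs and data are assumptions chosen for the example; backpropagation extends the same idea to networks with several layers.

```python
def train_linear_unit(samples, learning_rate=0.1, epochs=500):
    """Fit output = w * x + b by gradient descent on squared error.

    samples: list of (input, target) pairs
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, target in samples:
            output = w * x + b
            error = output - target
            # For E = 0.5 * error**2, dE/dw = error * x and dE/db = error.
            # Each parameter moves against its own partial derivative.
            w -= learning_rate * error * x
            b -= learning_rate * error
    return w, b


# Hypothetical data generated by target = 2 * x + 1.
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_unit(samples)
```

With this data the learned weight and bias should settle close to 2 and 1 respectively.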
History
Connectionism can be traced back to ideas more than a century
old. However, connectionist ideas were little more than speculation
until the mid-to-late 20th century. It was not until the 1980s that
connectionism became a popular perspective among scientists.
Parallel distributed processing
The prevailing connectionist approach today was originally known
as Parallel Distributed Processing (PDP). PDP was a neural network
approach that stressed the parallel nature of neural processing,
and the distributed nature of neural representations.
PDP provided a general mathematical framework for researchers to
operate in. The framework involved eight major aspects:
- A set of processing units, represented by a set of integers.
- An activation for each unit, represented by a vector of time-dependent
functions.
- An output function for each unit, represented by a vector of
functions on the activations.
- A pattern of connectivity among units, represented by a matrix
of real numbers indicating connection strength.
- A propagation rule spreading the activations via the connections,
represented by a function on the output of the units.
- An activation rule for combining inputs to a unit to determine
its new activation, represented by a function on the current activation
and propagation.
- A learning rule for modifying connections based on experience,
represented by a change in the weights based on any number of
variables.
- An environment which provides the system with experience, represented
by sets of activation vectors for some subset of the units.
These eight aspects are now the foundation for almost all connectionist
models.
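To make the eight aspects concrete, here is a toy Python class with one member per aspect. It is a sketch under loudly stated assumptions rather than the framework as the PDP authors specified it: the identity output function, the logistic blending used as the activation rule, the simple correlational learning rule and the clamping of environment patterns are all choices made for illustration, and every name in it is hypothetical.

```python
import math


class PDPNetwork:
    """Toy network organised around the eight PDP aspects (illustrative)."""

    def __init__(self, weights, learning_rate=0.1):
        n = len(weights)
        self.units = list(range(n))                    # 1. processing units
        self.activation = [0.0] * n                    # 2. activation per unit
        self.weights = [list(row) for row in weights]  # 4. connectivity, j -> i
        self.learning_rate = learning_rate

    def output(self, activation):                      # 3. output function
        return activation                              # identity (assumed)

    def propagate(self):                               # 5. propagation rule
        outputs = [self.output(a) for a in self.activation]
        return [sum(self.weights[i][j] * outputs[j] for j in self.units)
                for i in self.units]

    def activate(self, net_input):                     # 6. activation rule
        # New activation depends on the current activation and the net input.
        return [0.5 * a + 0.5 / (1.0 + math.exp(-net))
                for a, net in zip(self.activation, net_input)]

    def learn(self):                                   # 7. learning rule
        # Strengthen connections between units that are active together.
        for i in self.units:
            for j in self.units:
                if i != j:
                    self.weights[i][j] += (self.learning_rate
                                           * self.activation[i]
                                           * self.activation[j])

    def step(self, environment):                       # 8. environment
        # environment: dict unit -> activation for some subset of the units
        self.activation = self.activate(self.propagate())
        for unit, value in environment.items():
            self.activation[unit] = value              # clamp observed units
        self.learn()


net = PDPNetwork([[0.0] * 3 for _ in range(3)])
net.step({0: 1.0, 1: 1.0})  # the environment drives units 0 and 1 together
```

After this single step the connection between the two co-active units has grown more than their connections to the third unit.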
Much of the research that led to the development of PDP was done
in the 1970s, but PDP became popular in the 1980s with the release
of Parallel Distributed Processing: Explorations in the Microstructure
of Cognition, Volume 1 (Foundations) and Volume 2 (Psychological
and Biological Models), by James L. McClelland, David E. Rumelhart,
and the PDP Research Group. Though the books are now considered
seminal connectionist works, the term "connectionism" was
not used by the authors to describe their framework at that point.
However, it is now common to fully equate PDP and connectionism.
Earlier work
PDP's direct roots were the perceptron theories of researchers
such as Frank Rosenblatt from the 1950s and 1960s. However, perceptron
models were made very unpopular with the release in 1969 of a book
titled Perceptrons by Marvin Minsky and Seymour Papert. Minsky and
Papert elegantly demonstrated the limits on the sorts of functions
which perceptrons can calculate, showing that even simple functions
like the exclusive disjunction could not be handled properly. The
PDP books overcame this earlier limitation by showing that multi-level,
non-linear neural networks were far more robust and could be used
for a vast array of functions.
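The limitation and its remedy can be illustrated with a small Python sketch. It assumes simple threshold units, searches a coarse grid of weights for a single unit that computes the exclusive disjunction (XOR), and then builds a two-level network by hand. The grid search is only a demonstration, not a proof, and the hand-picked weights are one of many possible choices; all names are hypothetical.

```python
from itertools import product


def step(x):
    """Threshold unit: fires (1) when its net input is positive."""
    return 1 if x > 0 else 0


def perceptron(x1, x2, w1, w2, bias):
    """Single threshold unit with two inputs."""
    return step(w1 * x1 + w2 * x2 + bias)


XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def computes_xor(fn):
    return all(fn(x1, x2) == target for (x1, x2), target in XOR.items())


# Try every weight and bias from -4.0 to 4.0 in steps of 0.5.
grid = [v / 2 for v in range(-8, 9)]
single_unit_solutions = [
    (w1, w2, b) for w1, w2, b in product(grid, repeat=3)
    if computes_xor(lambda x1, x2: perceptron(x1, x2, w1, w2, b))
]


def two_level_xor(x1, x2):
    """Two hidden threshold units (OR, AND) feeding one output unit."""
    h_or = perceptron(x1, x2, 1.0, 1.0, -0.5)
    h_and = perceptron(x1, x2, 1.0, 1.0, -1.5)
    return perceptron(h_or, h_and, 1.0, -1.0, -0.5)  # OR but not AND


two_level_works = computes_xor(two_level_xor)
```

No setting on the grid makes the single unit compute XOR, because the function is not linearly separable, while the two-level network handles all four input patterns.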
However, many researchers outside of the perceptron tradition
were advocating connectionist-style models prior to
the 1980s. As early as 1869, the neurologist John Hughlings Jackson
was arguing for multi-level, distributed systems.
In the 1940s and 1950s, researchers such as Warren McCulloch, Walter
Pitts, Donald Hebb, and Karl Lashley were advocating connectionist-style
theories. McCulloch and Pitts showed how first-order logic
could be implemented by neural systems. Hebb contributed greatly
to speculations about neural functioning, and even proposed a learning
principle that is still in use today, known as Hebbian learning.
Lashley argued for distributed representations as a result of his
failure to find anything like a localized engram in years of lesion
experiments.
Connectionism apart from PDP
Though PDP is the dominant form of connectionism, other theorists'
work should be classified as connectionist.
Many connectionist principles can be traced back to early work
in psychology, such as that of William James, who set up one
of the first psychology labs in North America, and Edward Thorndike,
a psychologist who studied learning around the turn of the 20th century.
In the 1950s the researcher Friedrich Hayek posited the idea of
spontaneous order in the brain arising out of decentralized networks
of simple units, but Hayek's work was not cited in the PDP literature.
Another form of connectionist model was the relational network
framework developed by the linguist Sydney Lamb in the 1960s. Relational
networks have only ever been used by linguists, and have never been
unified with the PDP approach. As a result, relational networks
are used by very few researchers today.
Connectionism vs. computationalism debate
As connectionism became increasingly popular in the late 1980s
there was a reaction against connectionism by some researchers,
including Jerry Fodor, Steven Pinker, and many others. These theorists
argued that connectionism, as it was being developed at that time,
was in danger of obliterating the progress made in the fields of
cognitive science and psychology by the classical approach of computationalism.
Computationalism is a specific form of cognitivism which argues
that mental activity is computational, i.e. that the mind is essentially
a Turing machine. Many researchers argued that the trend in connectionism
was towards a reversion to associationism, and the abandonment of
the idea of a language of thought, something they felt was mistaken.
On the other hand, it was those very tendencies that made connectionism
attractive for other researchers.
Connectionism and computationalism need not be at odds per se,
but the debate as it was phrased in the late 1980s and early 1990s
certainly led to opposition between the two approaches. Throughout
the debate some researchers have argued that connectionism
and computationalism are fully compatible, although nothing like a
consensus has been reached. The differences between the two approaches
that are usually cited are the following:
- Computationalists posit symbolic models that do not resemble
underlying brain structure at all, whereas connectionists engage
in "low level" modeling, trying to ensure that their
models resemble neurological structures.
- Computationalists generally focus on the structure of explicit
symbols (mental models) and syntactical rules for their internal
manipulation, whereas connectionists focus on learning from environmental
stimuli and storing this information in a form of connections
between neurons.
- Computationalists believe that internal mental activity consists
of manipulation of explicit symbols, whereas connectionists believe
that the manipulation of explicit symbols is a poor model of mental
activity.
Though these differences do exist, they may not be necessary.
For example, it is well known that connectionist models can actually
implement symbol manipulation systems of the kind used in computationalist
models. So, the differences might be a matter of the personal choices
that some connectionist researchers make as opposed to anything
fundamental to connectionism.
To make matters more complicated, the recent popularity of dynamical
systems in philosophy of mind (due to the works of authors such
as Tim van Gelder) has added a new perspective on the debate. Some
authors now argue that any split between connectionism and computationalism
is really just a split between computationalism and dynamical systems,
suggesting that the original debate was wholly misguided.
All of these opposing views have led to a fair amount of discussion
on the issue amongst researchers, and it is likely that the debates
will continue.
|