Information about http://www.csse.monash.edu.au/~damian/papers/PDF/SevenDeadlySins.pdf

Seven Deadly Sins of Introductory Programming Language…

Tags: civer, damian conway, department of computer science, haskell, introductory programming language, lisp, mers, modula, monash university, programming language design, programming languages, programming skills, prolog, scheme language, selection principles, seven deadly sins, support tools, time program, undesirable features, victoria australia,
Pages: 8
Language: english
Created: Fri Nov 15 12:23:00 2002
Display cached document
Page 1
image
Page 2
image
Page 3
image
Page 4
image
Page 5
image
Page 6
image
Page 7
image
Page 8
image
             Seven Deadly Sins of Introductory Programming Language Design

                                              Linda M cIver & Damian Conway
                                             Department of Computer Science
                                            Monash University, Victoria, Australia


Abstract                                                              The observations and suggestions contained in this pa-
                                                                   per have been developed as part of the GRAIL1 project
   We discuss seven undesirable features common to many            within the Department of Computer Science, Monash
programming languages used to teach first-time program-            University. The aim of this project is to study the early ac-
mers, and illustrate typical pedagogical difficulties which        quisition of programming skills, with the ultimate goal of
stem from them with examples drawn from the program-               creating more usable languages and support tools.
ming languages ABC, Ada, C, C++, Eiffel, Haskell, LISP,
Modula 3, Pascal, Prolog, Scheme, and Turing. We propose           Seven Deadly Sins
seven language design (or selection) principles which may
reduce the incidence of such undesirable features.                 1. Less is more.

Introduction                                                           The "less is more" principle appears in many forms, al-
                                                                   most all of which seem to be ultimately detrimental to the
    Learning to program is difficult. We believe that a sub-       learning process. Perhaps the most obvious examples are
stantial part of this difficulty arises from the structure, syn-   the Scheme language and other LISP variants. Scheme has
tax and semantics of the programming languages which are           effectively only one data type ­ the list ­ and one operation
commonly used to teach programming.                                ­ evaluation of a list. While this abstraction is very simple
    Programming language designers are (of necessity)              to explain, and not difficult for the beginner to grasp super-
highly intelligent experts in the field of programming, and        ficially, it does result in code that is difficult to read because
are consequently far removed both temporally and cogni-            of large numbers of nested parentheses and the absence of
tively from the difficulties experienced by the novice pro-        other structuring punctuation.
grammer. This gulf of experience and ability results in lan-           Furthermore, to support this extreme degree of homo-
guages which are either too restrictive or too powerful (or        geneity, a large number of inbuilt functions are required,
sometimes, paradoxically, both).                                   many of which are quite sophisticated in their behaviour,
    We divide introductory programming languages into              and therefore difficult to understand and use correctly (for
two broad categories: special purpose teaching languages           example: sort vs sortcar in Franz LISP [14]).
(such as Pascal [1], Turing [2], ABC [3]) and popular "real-           A "less is more" approach is usually justified in terms of
world" languages (such as C [4], C++ [5], Ada [6], Modula          paradigmatic purity: strict adherence to a single functional,
3 [7], Haskell [8], and Scheme [9]).                               logical or object oriented paradigm. While this orthodoxy
    There is a long history of scholarly and less-than-schol-      can make for a certain conceptual simplicity and elegance
arly debate [10,11,12,13] regarding the comparative merits         (which can be of considerable advantage in teaching con-
and flaws of many of these languages. Typically this debate        cepts such as scoping, recursion and encapsulation), in prac-
centres on theoretical issues (such as expressiveness, range       tice it can also lead to extremely obscure and unreadable
of concepts supported, or paradigmatic integrity) and              code. In some cases, relatively simple programs must be
practical considerations (such as range of available plat-         substantially restructured to achieve even basic effects such
forms, support tools and environments, or efficiency). This        as input and output.
paper departs from that tradition in that it focuses exclu-            The underlying pedagogical difficulty is that students
sively on the issues which arise in the context of teaching        are not used to solving problems in a single pure paradigm.
introductory programming.                                          Much of the problem solving they do in the real world is
    We enumerate seven serious pedagogical problems,               procedural in nature (cooking a meal, totalling a restaurant
each of which is common to most or all of the above-men-           bill, etc.), but other problems with which they are familiar
tioned languages, even those specifically designed for             are more amenable to constraint solving (dispute resolution,
teaching. We also propose a like number of design princi-          holiday planning, budgeting), a functional or pipe-line ap-
ples, which we believe will lead to the development of sig-        proach (collaborative tasks, various types of component as-
nificantly more "teachable" introductory programming lan-          sembly), or even object-oriented methods (using an auto-
guages. Finally we suggest seven criteria which educators          matic teller machine, learning physical skills).
may find useful in evaluating existing languages for intro-            The results of enforcing paradigmatic purity can be as
ductory teaching.                                                  simple as the annoying requirement in Turing that functions
                                                                   have no side-effects, or as far reaching as the mysteries of
                                                                   1 Genuinely Readable And Intuitive Languages
I/O in Haskell, which sometimes necessitate the warping of       these constructs arise from the constraints of the ASCII
the entire structure of an otherwise elegant and comprehen-      character set, whilst others are the result of a deliberate but
sible functional program.                                        (in our view) misguided "less-is-more" design policy. The
                                                                 common feature of these problems is that they are analo-
2. More is more.                                                 gous to certain sophisticated grammatical constructs in nat-
                                                                 ural languages, and result in the same types of learning
    It is equally true that many languages are based on de-      problems as are seen in natural language acquisition.
sign philosophies which err in the other extreme. Powerful,          One such construct is the syntactic synonym, in which
real world languages (C++ and Ada, for example) are              two or more syntaxes are available to specify a single con-
amongst the prime culprits here. Often such languages are        struct. An common example of this is dynamic array access
taught by subsetting ­ teaching a small but usable part of       in C, wherein the second2 element of an array may be ac-
the language whilst ignoring its more powerful features.         cessed by any of the following syntaxes, some of which are
    At first glance this approach seems quite reasonable, but    legal only in certain contexts:
two pedagogical problems frequently sabotage it. The first           array[1]       *(array+1)        1[array]        *++array
is that textbooks and other reference materials rarely con-          A less well-known example is list construction. In
fine themselves to the selected subset. The second is that,      Haskell the list construction expression [1,2,3] is syn-
even if the textbook does limit itself to the required subset,   onymous with 1:(2:(3:[])). In Prolog [1,2,3] is
the compiler almost certainly does not. The result is often      equivalent to .(1,.(2,.(3,[]))) and both produce a
worse than if the complete language was taught: students         third form on evaluation: (1,2,3).
must still contend with the full semantics of the language,          In themselves, synonyms are minor irritants (which
but much of it has deliberately not been explained to them!      multiply the learning curve for particular constructs by no
    C++ is certainly one of the most popular languages in        more than 200­300%) However, they can also have a more
"real-world" use and (for that very reason) is also increas-     serious and insidious effect by blurring the underlying pro-
ingly widely taught as an introductory language. One of the      gramming concept in the student's mind, because that con-
justifications typically cited for teaching C++ [11] is the      cept is no longer associated with a single clear syntax.
range of low- and high-level features it provides (from bit          Syntactic homonyms (constructs which are syntactically
manipulation of raw pointers, to templated abstract classes      the same, but which have two or more different semantics
with polymorphic member functions).                              depending on context) are perhaps a more serious flaw in a
    Beginners, however, are notoriously poor at maintaining      language. An extreme example of this3 may be seen in the
two or more conceptual perspectives simultaneously [15].         pedagogically-oriented language Turing, in which the con-
Dichotomies of perspective (such as syntax vs semantics,         struct A(B) has five distinct meanings:
static vs dynamic structure, process vs data) complicate the         · call function A and pass parameter B
teaching of any programming language. The availability of            · dereference the pointer B to access an object in collec-
very low-level implementation-oriented constructs and                   tion A
high-level solution-oriented features in a single language           · access the Bth element of array A
only serves to increase substantially the already consider-          · construct a set of type A with a single element having
able cognitive demands placed on the student.                           the value of B
    As well as the obvious concerns regarding learning               · create a one-letter substring of the string A consisting
curves, confusion of levels, and the difficulties of adequate           of its Bth character
error detection, a wide range of features necessitates a             The student, armed with only a fuzzy understanding of
commensurately complex syntax and often also entails a           the differences between these concepts, finds no support
host of implicit operations and function calls, automatic        from the syntax. It should be noted that the decision to over-
conversions, type inferences, and resolutions of overloaded      load this construct was taken quite deliberately and on peda-
functions, variable and function scoping.                        gogical grounds:
    Examples of this "creeping featuritis" are easy to cite:
C++ provides over 50 distinct operators at 17 levels of              "Notice that referencing an element of array a with
precedence, Ada9X has 68 reserved words and over 50 pre-             subscript i as in a(i) is notationally equivalent to
defined attributes, Modula 3 reserves over 100 keywords,             [the pointer dereference] c(p). This is an example
and some commonly-used LISP dialects ([14] for example)              of uniform referents, which means that analogous
define over 500 special functions. Because most textbooks            ways of accessing data should be notationally
and compilers attempt to cover the full language, novice             equivalent." [17]
programmers are forced to contend with all of these fea-             Another difficult grammatical construct which fre-
tures, even if they are not using them.                          quently appears in languages is elision (the omission of a
                                                                 syntactic component). C is well known for its default inte-
3. Grammatical traps.                                            ger return value and its curious string literal concatenation

   Another class of pedagogical problems stems from vari-        2 The fact that array[1] refers to the second element of array is itself a
                                                                   grammatical trap.
ous kinds of confusing syntactic and semantic constructs         3 But not as extreme as LISP and its variants, which could be viewed as
which are present in most introductory languages. Some of          one massive homonym.
behaviour, but default behaviours occur in many languages:       their ancestors), nonetheless languages which attempt a
automated sorting of lists in ABC, type inference in Haskell     significant degree of historical consistency inevitably per-
and Turing, LISP superbrackets, switch fall-through in           petuate some problematical constructs.
C++, etc.                                                            Language designers occasionally acknowledge the prob-
                                                                 lems that their quest for genetic compatibility produces:
4. Hardware dependence.                                              "At this point, the reader may be confused at having
                                                                     so many ways to define a [Haskell] function! The
    In addition to battling the various syntactic and semantic       decision to provide these mechanisms partly reflects
levels of an introductory language, the novice programmer            historical conventions, and partly reflects the desire
is often forced to contend simultaneously with the con-              for consistency (for example, in the treatment of
straints of the underlying hardware (sometimes merely for            infix vs. regular functions)." [8]
the convenience of the compiler writer).
                                                                     "Over the years, C++'s greatest strength and its
    This "closeness to the metal" is particularly noticeable
                                                                     greatest weakness has been its C compatibility. This
in the design and implementation of basic numerical and
                                                                     came as no surprise." [16]
character string types. There seems no convincing reason
why the novice student, already struggling to master the             As well as introducing (or compounding) the problems
syntax and semantics of various constructs, should also be       inherent in a "more-is-more" approach, the addition of new
forced to deal with the details of representational precision,   concepts to an old language often leads to inconsistency of
varying machine word sizes, awkward memory models, or a          abstraction (consider the differing semantics of TEXT and
profusion of conceptually-equivalent but semantically-dis-       other array types in Modula 3), the creation of synonyms or
tinct data types.                                                homonyms (array indexing, function calls and pointer
    The semantics of arrays in Pascal, in which the novice       dereference in Turing), as well as the perpetuation of out-
must grapple with the fundamental type difference of arrays      moded or flawed constructs (such as char* strings in C++)
of different lengths, is a notable example. The presence of      or syntax (for example, the inexplicably-named5 car and
thirty-two distinct numerical data types in C/C++4 is an-        cdr which Scheme inherits from LISP).
other. These types are particularly problematical in C as            Not all syntactic or semantic inheritance stems from de-
they are generally not portable. The standard int type, for      liberate decisions regarding backwards compatibility. Some
example, varies from 16 to 32 bit representations depending      constructs and symbols seem to propagate memetically
on the machine and the implementation. This can lead to          across language family boundaries, and have become de
strange and unexpected errors when overflow occurs. A            facto standards within the programming community. This is
student whose program attempts to add a $4,000 bonus to a        often despite the fact that such constructs may have been
$30,000 salary may be justifiably confused to find that the      viewed by their progenitors as ad hoc and may indeed have
result is a negative number.                                     no discernible connection with the concepts they are in-
                                                                 tended to represent. Memetic compatibility is surprisingly
5. Backwards Compatibility.                                      pervasive and may be seen in the widespread use of "stan-
                                                                 dard" symbols such as * for multiplication, = or := for as-
    Backwards compatibility is a useful property from the        signment, array[index] for indexing.
experienced programmer's point of view, as it promotes               The major pedagogical problem with the presence of
reuse of both code and programming skills. The novice            such syntactic memes is that they significantly reduce the
however can take no advantage of these benefits and must         degree to which the novice, an outsider to the programmer
instead bear the pedagogical costs they entail.                  culture, can rely on existing knowledge (such as × mean-
    Backwards compatibility comes in two major forms:            ing multiply, or a subscript representing an index).
"genetic" and "memetic". Whilst both forms can lead to               Unfortunately, memetic compatibilities can also be par-
pedagogically suspect decisions during the design of a lan-      ticularly difficult to identify (and their pedagogical effects
guage, genetic compatibility is generally the result of a con-   correspondingly hard to analyse), precisely because both the
scious decision on the part of the language designers,           language designer and the programming teacher are so
whereas memetic compatibility is frequently inadvertent.         familiar with them.
    Genetic compatibility is exemplified by the relationship
between languages such as C++ and C [16], Scheme and             6. Excessive Cleverness.
LISP [9], and Turing, Euclid and Pascal[17] , and results
from the decision to retain the semantics and often the              Instances of "excessive" cleverness can be difficult to
general syntactic "look-and-feel" of an ancestor language.       spot, precisely because the "excess" exists only relative to
Genetic compatibility need not of course imply the near          the knowledge level of the novice. Frequently the only way
complete backwards compatibility as seen in the C/C++ re-        to detect excessive cleverness is to see a novice program-
lationship (Turing and Scheme differ significantly from          mer's complete misunderstanding of an "obvious" concept.

4 int, short, long, unsigned int, unsigned short, unsigned       5 "Inexplicable" in the sense that explaining that they derive from
  long, float and double; plus three const and/or volatile        "contents of address register" and "contents of decrement register"
  variants of each.                                               respectively, rarely assists the student's comprehension or recall.
    The premier example of the adverse effects of clever-       designer can commit. Some examples are very well-known,
ness in programming language design (and one which is           such as the almost maliciously designed C/C++:
obvious to programmers at all skill levels) must surely be         if (x=0 || -10