Literate programming is a philosophy of computer programming based on the premise that a computer program should be written much like literature, with human readability as a primary goal. According to this philosophy, programmers should aim for a “literate” style in their programming just as writers aim for an intelligible and articulate style in their writing. This philosophy contrasts with the mainstream view that the programmer’s primary or sole objective is to create source code and that documentation should only be a secondary objective.
In practice, literate programming is achieved by combining human-readable documentation and machine-readable source code into a single source file, in order to maintain close correspondence between documentation and source code. The order and structure of this source file are specifically designed to aid human comprehension: code and documentation together are organized in logical and/or hierarchical order (typically according to a scheme that accommodates detailed explanations and commentary as necessary). At the same time, the structure and format of the source files accommodate external utilities that generate program documentation and/or extract the machine-readable code from the same source file(s) (e.g., for subsequent processing by compilers or interpreters).
History and current implementations
The first published literate programming environment was WEB, introduced by Donald Knuth in 1981 for his TeX typesetting system; it uses Pascal as its underlying programming language and TeX for typesetting of the documentation. The complete commented TeX source code was published in Knuth's TeX: The Program, volume B of his 5-volume Computers and Typesetting. Knuth had internally used a literate programming system called DOC as early as 1979; he was inspired by the ideas of Pierre Arnoul de Marneffe. The free CWEB, written by Knuth and Levy, is an adaptation of WEB for C and C++; it runs on most operating systems and can produce TeX and PDF documentation. Other implementations of the concept are noweb and FunnelWeb.
Related concepts
Outlining
Outlining editors are sometimes seen as providing a variant of the original concept of literate programming as used by Knuth. In particular, Leo combines outlining with interfaces to noweb and CWEB processors.
Embedded documentation
There are also systems for integrating documentation and code that are less powerful than literate programming; examples are pod for Perl, DOC++ for C, C++ and Java, Javadoc for Java, and Doxygen for many languages. See also documentation generator.
These, however, do not quite follow the literate programming philosophy, since they typically just produce documentation about the program, such as specifications of functions and parameters, and not documentation of the program source code itself. They also do not allow rearrangement of presentation order, which is critical to the effectiveness of literate programming.
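The rearrangement of presentation order mentioned above can be illustrated with a minimal sketch. It assumes a simplified notation loosely modelled on noweb's (`<<name>>=` opens a named chunk, `@` closes it, `<<name>>` on its own line refers to it); the function names `collect_chunks` and `tangle` are hypothetical and this is not the implementation of any actual tool.

```python
import re

CHUNK_DEF = re.compile(r"^<<(.+)>>=\s*$")      # line that opens a named chunk
CHUNK_REF = re.compile(r"^(\s*)<<(.+)>>\s*$")  # line that refers to a chunk


def collect_chunks(source):
    """Gather code chunks by name; text outside chunks is documentation."""
    chunks = {}
    current = None
    for line in source.splitlines():
        opened = CHUNK_DEF.match(line)
        if opened:
            current = opened.group(1)
            chunks.setdefault(current, [])  # a repeated name extends the chunk
        elif line.strip() == "@":
            current = None                  # back to documentation
        elif current is not None:
            chunks[current].append(line)
    return chunks


def tangle(chunks, name, indent=""):
    """Expand a chunk recursively into plain code lines."""
    out = []
    for line in chunks[name]:
        ref = CHUNK_REF.match(line)
        if ref:
            # Splice in the referenced chunk, keeping the reference's indentation.
            out.extend(tangle(chunks, ref.group(2), indent + ref.group(1)))
        else:
            out.append(indent + line)
    return out


SOURCE = """\
The outline of the program comes first, because it is easiest to explain.
<<main>>=
def area(radius):
    <<compute area>>
print(area(2))
@
The formula is presented afterwards, where the prose discusses it.
<<compute area>>=
return 3.14159 * radius * radius
@
"""

program = "\n".join(tangle(collect_chunks(SOURCE), "main"))
print(program)
```

The chunk holding the formula is defined after the code that uses it, yet the extracted program places it inside the function body. Tools such as Javadoc or Doxygen have no equivalent of this step, because they read code in the order the compiler requires.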
Haskell is a modern language that makes use of a limited form of literate programming: this semi-literate style does not allow code re-ordering or multiple expansion of definitions but lets the programmer intersperse documentation and code freely. What distinguishes semi-literate programming from excessive documenting, where the documentation is embedded into the code as comments, is that documentation can be written freely whereas code must be marked in a special way (see the example below).
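As a rough illustration of code being the marked part, the sketch below extracts the code from a small text written in Haskell's "bird track" convention, where code lines begin with `>` and everything else is commentary. The function name `unlit` and the sample text are assumptions made for this example, and the extraction is simplified compared with what a Haskell compiler actually does.

```python
def unlit(text):
    """Keep only the lines marked with '>' and drop the marker itself."""
    code = []
    for line in text.splitlines():
        if line.startswith("> "):
            code.append(line[2:])
        elif line.startswith(">"):
            code.append(line[1:])
    return "\n".join(code)


LITERATE = """\
The factorial of n is the product of the numbers from 1 to n.

> factorial :: Integer -> Integer
> factorial n = product [1..n]

Everything not marked with '>' is commentary.
"""

print(unlit(LITERATE))
```

The commentary needs no comment delimiters at all, which is the reverse of ordinary source files where the prose is the part that has to be marked.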
Doctests
Similarly to source code, testing code that exercises an API can be embedded within human-readable documentation, along with the expected output of the calls. A test runner extracts and executes the code and verifies its output against the expected output. This idea originated in the Python programming language. An implementation is provided by the Python standard library's doctest module.
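A minimal sketch of the idea, using the doctest module from the Python standard library: the function `circle_area` is a hypothetical example, and its docstring carries both the explanation and two interactive-session examples that serve as tests.

```python
import doctest


def circle_area(radius):
    """Return the area of a circle, using 3.14159 as an approximation of pi.

    >>> circle_area(1)
    3.14159
    >>> circle_area(0)
    0.0
    """
    return 3.14159 * radius * radius


# Extract the examples from the docstring, run them, and compare the output.
finder = doctest.DocTestFinder(recurse=False)
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(circle_area, name="circle_area",
                        globs={"circle_area": circle_area}):
    runner.run(test)
results = runner.summarize(verbose=False)
print(results.attempted, "examples run,", results.failed, "failed")
```

In everyday use a module would more commonly call `doctest.testmod()`, which finds and runs the examples in all of the module's docstrings; the explicit finder and runner are used here only to keep the sketch independent of how it is executed.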
Example of a simple literate program and interpreter
Program
This section contains a literate program, which can be run using the example literate interpreter in the Interpreter section.
For this particular interpreter, all the program code must be written on lines starting with a dash. Everything else is ignored by the interpreter. This does not support some important aspects of advanced literate programming, such as code rearrangement or multiple expansion, and so should only be called basic or "semi-literate" programming.
Program to calculate the area of a circle and rectangle
Firstly, in the interests of putting the user at ease, the program will simulate personal interest in the user by asking for their name, accepting the input and generating a greeting based on the input text.
>
- clearscreen
- print text Please type your name:
- store input
- print Hello there,
- print value
- print . Nice to meet you.
- newline
- newline
Continuing the "query-response" mode of operation, prompt the user for the radius of a circle, which is then used to calculate the area using the standard formula $A = \pi r^2$. Due to syntax limitations, this is done by multiplying the input value by itself, then by $\pi$. This calculated value is returned to the user.
Note: the value of $\pi$ used is an approximation that is sufficiently accurate for our purposes.
>
- print text Let's work out the area of a circle.
- newline
- print text Please enter the radius of the circle in furlongs:
- store input
- multiplyby value
- multiplyby 3.14159
- print Thank you
- newline
- print text The area of the circle is
- print value
- print text square furlongs.
- newline
- newline
Finally, the user is asked for the width and length of a rectangle, and the area of the rectangle is worked out using the standard width-by-length formula.
>
- print text Now let's work out the area of a rectangle.
- newline
- print text Please enter the width of the rectangle in ells:
- store input
- print text Please enter the length of the rectangle in ells:
- multiplyby input
- print Thank you
- newline
- print text The area of the rectangle is
- print value
- print text square ells.
- newline
- newline
- print text Goodbye,
Interpreter
The following simple interpreter program is written in BASIC. When compiled using the QuickBASIC compiler it is a straightforward interpreter, but when run on the QBASIC interpreter, it is an example of an interpreted interpreter.
>
DECLARE SUB SplitFirst (aFirst AS STRING, aRest AS STRING)

' Read the literate program line by line.
' Only lines whose first word is a dash are treated as code.
LET Q$ = "TESTPROG.TXT"
LET F = FREEFILE
OPEN Q$ FOR INPUT AS #F
DO WHILE NOT EOF(F)
    LINE INPUT #F, FileInput$
    LET FileInput$ = LTRIM$(FileInput$)
    SplitFirst KeyWord$, FileInput$
    SELECT CASE KeyWord$
        CASE "-"
            SplitFirst KeyWord$, FileInput$
            GOSUB InterpretKeyword
    END SELECT
LOOP
CLOSE #F
SYSTEM

' Dispatch on the command word that follows the dash.
InterpretKeyword:
    SELECT CASE UCASE$(KeyWord$)
        CASE "STORE"
            GOSUB AssignToValue
        CASE "ADD"
            GOSUB AddToValue
        CASE "MULTIPLYBY"
            GOSUB MultiplyWithValue
        CASE "PRINT"
            GOSUB PutOutput
        CASE "CLEARSCREEN"
            GOSUB ClearScreen
        CASE "NEWLINE"
            PRINT
        CASE ELSE
            PRINT
            PRINT "I don't know what "; KeyWord$; " "; FileInput$; " means."
    END SELECT
RETURN

AssignToValue:
    GOSUB GetArg
    LET Value$ = Arg$
RETURN

AddToValue:
    GOSUB GetArg
    LET Value$ = LTRIM$(STR$(VAL(Value$) + VAL(Arg$)))
RETURN

MultiplyWithValue:
    GOSUB GetArg
    LET Value$ = LTRIM$(STR$(VAL(Value$) * VAL(Arg$)))
RETURN

' Resolve the argument: user input, the stored value, literal text,
' or the remaining words taken as they are.
GetArg:
    SplitFirst KeyWord$, FileInput$
    SELECT CASE UCASE$(KeyWord$)
        CASE "INPUT"
            GOSUB GetInput
            LET Arg$ = UserInput$
        CASE "VALUE"
            LET Arg$ = Value$
        CASE "TEXT"
            LET Arg$ = FileInput$
        CASE ELSE
            LET Arg$ = KeyWord$ + " " + FileInput$
    END SELECT
RETURN

GetInput:
    LINE INPUT "", UserInput$
RETURN

PutOutput:
    GOSUB GetArg
    PRINT Arg$; " ";
RETURN

ClearScreen:
    CLS
RETURN

NewLine:
    PRINT
RETURN

' Split the first space-delimited word off aRest and return it in aFirst.
SUB SplitFirst (aFirst AS STRING, aRest AS STRING)
    DIM J AS INTEGER
    LET J = INSTR(aRest + " ", " ")
    LET aFirst = LTRIM$(LEFT$(aRest, J - 1))
    LET aRest = LTRIM$(MID$(aRest, J))
END SUB
To try it with the program above, save the interpreter as INTERP.BAS and save the program section as TESTPROG.TXT in the same folder, then run INTERP.BAS with either QBASIC or QuickBASIC.
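For readers without access to QBASIC, the same dash-line rule can be sketched in Python. This is a rough port under stated assumptions rather than a faithful translation: the names `run`, `split_first` and `to_number` are hypothetical, numbers are formatted the way Python formats floats, unparsable text is simply treated as 0, and `clearscreen` is accepted but does nothing.

```python
import sys


def split_first(text):
    """Split off the first space-delimited word; return (word, rest)."""
    word, _, rest = text.strip().partition(" ")
    return word, rest.strip()


def to_number(text):
    """Rough stand-in for BASIC's VAL: unparsable text counts as 0."""
    try:
        return float(text)
    except ValueError:
        return 0.0


def run(source, read_input=input, emit=sys.stdout.write):
    """Execute the dash-prefixed lines of a literate program; skip the rest."""
    value = ""  # the interpreter's single storage cell

    def get_arg(rest):
        keyword, tail = split_first(rest)
        kind = keyword.upper()
        if kind == "INPUT":
            return read_input()
        if kind == "VALUE":
            return value
        if kind == "TEXT":
            return tail
        return (keyword + " " + tail).strip()  # a bare literal

    for line in source.splitlines():
        marker, rest = split_first(line)
        if marker != "-":
            continue  # documentation, not code
        keyword, rest = split_first(rest)
        command = keyword.upper()
        if command == "STORE":
            value = get_arg(rest)
        elif command == "ADD":
            value = str(to_number(value) + to_number(get_arg(rest)))
        elif command == "MULTIPLYBY":
            value = str(to_number(value) * to_number(get_arg(rest)))
        elif command == "PRINT":
            emit(get_arg(rest) + " ")
        elif command == "NEWLINE":
            emit("\n")
        elif command == "CLEARSCREEN":
            pass  # clearing the terminal is left out of this sketch
        else:
            emit("\nI don't know what " + keyword + " " + rest + " means.\n")


SAMPLE = """\
Ask for a radius and report the area of the circle.
- print text Radius?
- store input
- multiplyby value
- multiplyby 3.14159
- print text Area:
- print value
- newline
"""

output = []
run(SAMPLE, read_input=lambda: "2", emit=output.append)
print("".join(output))
```

Passing `read_input` and `emit` as parameters is a design choice made for this sketch so that the interpreter can be exercised without a keyboard; calling `run(text)` with the defaults reads from the terminal and writes to standard output.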
References
- Eitan M. Gurari, TeX & LaTeX: Drawing and Literate Programming. McGraw Hill 1994. ISBN 0-07-911616-7 (includes software).
- Donald E. Knuth, Literate Programming, Stanford, California: Center for the Study of Language and Information, 1992, CSLI Lecture Notes, no. 27.
- Pierre Arnoul de Marneffe, Holon Programming. Univ. de Liege, Service d'Informatique (December, 1973).