Subject: File No. S7-08-10
From: Derek M Jones
Affiliation: Visiting Professor at Kingston University, London

April 21, 2010

I welcome the requirement that source code be released.

Computer language issues

It is rare for the specification of a computer language to fully enumerate the behavior of every language construct under all conditions. Possible reasons for this state of affairs are discussed in "Forms of language specification: Examples from commonly used computer languages", downloadable from: http://www.knosof.co.uk/vulnerabilities/langconform.pdf

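The Python language is specified by a method known as a reference implementation; that is, the behavior of one particular implementation defines the language. The reference implementation is freely available from http://wiki.python.org/moin/CPython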

Achieving the proposal's aims of making program source code available requires that, for a given set of input data and program execution options, execution always produce the same output on all computing platforms used. One method for significantly reducing the probability of a program producing different output on different platforms is to use the same implementation of the computer language on every platform.

Multiple implementations of Python are available, see http://wiki.python.org/moin/implementation, and there is no way of guaranteeing that a program executed by different implementations will always produce the same output.
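Although identical output cannot be guaranteed in advance, it can be checked after the fact. The following is a minimal sketch, assuming the program's output has been captured as bytes on each platform or implementation; the function names `output_digest` and `outputs_match` are hypothetical and are not part of any proposal in S7-08-10.

```python
import hashlib


def output_digest(output: bytes) -> str:
    """Return the SHA-256 hex digest of a program's captured output."""
    return hashlib.sha256(output).hexdigest()


def outputs_match(digests) -> bool:
    """Return True when every digest in the collection is identical."""
    return len(set(digests)) <= 1


if __name__ == "__main__":
    # Illustrative data only: in practice each entry would be the output
    # captured from running the same program on a different platform.
    run_a = b"cash flow: 100.00\n"
    run_b = b"cash flow: 100.00\n"
    print(outputs_match([output_digest(run_a), output_digest(run_b)]))
```

A matching digest only shows agreement for the inputs that were actually run; it says nothing about inputs that were not tried.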

The computer language usage proposals contained in S7-08-10 need to specify that the program source code be accompanied by information on the computer language implementation used to execute the program and how this implementation can be obtained.
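As an illustration of the kind of information meant here, a Python program could emit a short record describing the implementation that executed it. The sketch below uses only the standard `platform`, `sys` and `json` modules; the function name `implementation_record` and the choice of fields are assumptions made for this example.

```python
import json
import platform
import sys


def implementation_record():
    """Return a dict describing the Python implementation running this code.

    The field names are illustrative; no particular format is implied.
    """
    return {
        # Name of the implementation, e.g. "CPython", "PyPy" or "Jython".
        "implementation": platform.python_implementation(),
        # Language version as a dotted string.
        "version": platform.python_version(),
        # Full version and build string reported by the interpreter.
        "build": sys.version,
        # Operating system and hardware the program ran on.
        "system": platform.system(),
        "machine": platform.machine(),
    }


if __name__ == "__main__":
    print(json.dumps(implementation_record(), indent=2, sort_keys=True))
```

Such a record identifies which implementation was used but not how it can be obtained; that information would still need to be supplied alongside the source code.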

Interpreted vs Compiled languages

In common usage, referring to a particular language as an interpreted language means that most implementations of it are interpreted (i.e., compiled implementations may exist). Similarly, referring to a language as a compiled language means that most implementations of it are compiled (i.e., interpreted implementations may exist).

The distinction between interpreted and compiled is not always clear-cut. Some implementations of interpreted languages compile the source to a form intermediate between source and machine code and then interpret this intermediate form. Some implementation techniques (e.g., threaded code, which is often used to implement the Forth language) are difficult to classify as either interpreted or compiled.
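Python itself shows this two-step arrangement. The sketch below uses the built-in functions `compile` and `eval` to separate translation of source text into an intermediate code object from the later execution of that object. The helper name `evaluate` and the sample expression are assumptions made for this example, and the contents of a code object are implementation specific (bytecode in CPython; other implementations may use a different intermediate form).

```python
def evaluate(source, **variables):
    """Compile an expression to a code object, then execute that object."""
    # Step 1: translate the source text into an intermediate form.
    code = compile(source, "<example>", "eval")
    # Step 2: have the implementation execute the intermediate form.
    return eval(code, dict(variables))


if __name__ == "__main__":
    print(evaluate("a * b + c", a=6, b=7, c=0))
```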

One property of interpreted languages is the ease of building implementations that produce the same output for a given program when it is executed on different computing platforms.

Choice of computer language

Interpreted languages have the advantage, mentioned above, of consistent execution across platforms.

There is no compelling reason for the SEC to select Python over any other language.

Everybody has their own favorite computer language, and there is no sign of convergence on the use of a single language, see http://shape-of-code.coding-guidelines.com/2009/06/101/

New computer languages are invented all the time, including in the financial world (e.g., the A+ language from Morgan Stanley, see: http://en.wikipedia.org/wiki/A+_(programming_language)).

Some issues that should be addressed when creating the criteria for permitting source code to be written in a particular computer language include:

1. Are implementations of the language sufficiently widely used that any problems with their correctness are likely to have been detected (and presumably fixed)?

2. Is the language specification stable? A constantly changing specification increases the likelihood that the wrong version of an implementation is used, leading to different program output, or that people reading the source draw incorrect conclusions because their knowledge of the language is inconsistent with that of the author of the code.

3. How easily can an implementation be obtained and used by a third party?