sml

You are currently browsing articles tagged sml.

I compiled some programming language popularity statistics in April 2009, October 2009 and October 2010 . Here’s an update for September 2011:

I made a number of Google searches of the forms below and summed the results (previous posts averaged the results):

"implemented in <language>"
  "written in <language>"

Naturally this is of very limited utility, and the numbers are only useful when comparing relatively within the same search since the number of results Google returns can vary greatly over time.

Language Total Prev. Position Position Delta
C 10,360,000 2 1
PHP 10,351,000 1 -1
C++ 6,495,000 3 0
Python 5,759,000 5 1
C# 5,335,000 4 -1
 
Java 4,890,000 8 2
Perl 3,702,000 6 -1
JavaScript 3,077,000 7 -1
Ruby 1,654,000 9 0
Lisp Family1 1,022,870 11 1
 
FORTRAN 975,600 10 -1
Tcl 594,500 12 0
Lisp 486,000 14 1
Haskell 450,500 16 2
Erlang 419,700 13 -2
 
Lua 367,100 18 2
ML Family2 348,400 17 0
COBOL 308,270 15 -3
Common Lisp 254,900 19 0
OCaml 240,300 21 1
 
Prolog 224,000 20 -1
Scala 203,400 23 1
Scheme 184,700 22 -1
Smalltalk 129,700 24 0
Clojure 84,600 27 2
 
(S)ML3 83,630 25 -1
Forth 69,980 26 -1
Caml 24,470 28 0
Io 17,700 30 1
Arc 12,670 29 -1

1 combines Lisp, Scheme, Common Lisp, Arc & Clojure
2 combines OCaml, (S)ML, Caml
3 summed separate searches for sml and ml

Tags: , , , , , , , , , , , ,

I compiled some programming language popularity statistics in April 2009 and October 2009 . Here’s an update for October 2010:

I made a number of Google searches of the forms below and averaged the results:

"implemented in <language>"
  "written in <language>"

Naturally this is of very limited utility, and the numbers are only useful when comparing relatively within one column since the number of results Google returns can vary greatly over time.

Language Apr 2009 Oct 2009 Oct 2010 Position Delta
PHP 680,000 5,083,500 14,096,000 +3
C 1,905,500 16,975,000 9,675,000 -1
C++ 699,000 6,270,000 6,510,000 -1
C# 349,700 2,125,000 5,132,000 +4
Python 396,000 3,407,000 5,114,500 +1
Perl 365,500 3,132,500 4,675,000 +1
JavaScript 102,700 1,163,000 2,120,000 +4
Java 850,000 5,118,000 1,495,500 -5
Ruby 99,650 227,000 1,426,000 +13
FORTRAN 1,621,000 770,850 0
Lisp Family1 176,507 3,489,650 399,685 -6
Tcl 44,800 382,000 313,400 +5
Erlang 22,285 161,700 188,800 +12
Lisp 61,900 486,500 174,050 +1
COBOL 247,300 166,435 +6
Haskell 22,550 280,500 157,150 +4
ML Family2 29,062 1,003,800 149,005 -5
Lua 13,065 131,800 128,150 +9
Common Lisp 20,600 554,500 112,750 -5
Prolog 17,750 390,500 100,000 -4
OCaml 22,000 343,500 99,050 -3
Scheme 86,450 2,100,000 82,650 -13
Scala 3,570 66,250 65,950 +6
Smalltalk 9,105 187,500 56,950 0
(S)ML3 5,173 590,700 42,130 -12
Forth 6,465 146,450 25,880 0
Clojure 782 62,200 23,525 +3
Caml 1,889 69,600 7,825 0
Arc 6,775 286,500 6,710 -10
Io 1,760 198,500 3,025 -7

1 combines Lisp, Scheme, Common Lisp, Arc & Clojure
2 combines OCaml, (S)ML, Caml
3 summed separate searches for sml and ml

Tags: , , , , , , , , , , , ,

I compiled some programming language popularity statistics in April and mentioned I’d update the results in 6 months, so here they are:

I made a number of Google searches of the forms below and averaged the results:

"implemented in <language>"
"written in <language>"

Language # Results
Apr 09
# Results
Oct 09
Position
Delta
C 1,905,500 16,975,000 0
C++ 699,000 6,270,000 +1
Java 850,000 5,118,000 -1
PHP 680,000 5,083,500 0
Lisp Family1 176,507 3,489,650 +3
Python 396,000 3,407,000 -1
Perl 365,500 3,132,500 -1
C# 349,700 2,125,000 -1
Scheme 86,450 2,100,000 +2
FORTRAN 1,621,000 N/A
JavaScript 102,700 1,163,000 -1
ML Family2 29,062 1,003,800 +3
(S)ML3 5,173 590,700 +12
Common Lisp 20,600 554,500 +5
Lisp 61,900 486,500 -2
Prolog 17,750 390,500 +4
Tcl 44,800 382,000 -3
OCaml 22,000 343,500 0
Arc 6,775 286,500 +4
Haskell 22,550 280,500 -4
COBOL 247,300 N/A
Ruby 99,650 227,000 -10
Io 1,760 198,500 +6
Smalltalk 9,105 187,500 -1
Erlang 22,285 161,700 -7
Forth 6,465 146,450 -1
Lua 13,065 131,800 -5
Caml 1,889 69,600 0
Scala 3,570 66,250 -2
Clojure 782 62,200 0

1 combines Lisp, Scheme, Common Lisp, Arc & Clojure
2 combines OCaml, (S)ML, Caml
3 summed separate searches for sml and ml

Tags: , , , , , , , , , , , , , , , , , , , ,

Table of Contents

Chapter 6 – Case Analysis

Tuple types are homogeneous e.g. all values of type int*real are pairs containing an int and a real. Match failures occur at compile time. Types that have more than one form, such as int, are heterogeneous. Pattern matches fail at run time as a bind failure.

ML defines functions over heterogeneous types using clausal function expressions. For example:

fn pat1 => exp1
 | pat2 => exp2
 | ...
 | patn => expn

Each component pat => exp is called a clause, or a rule. The entire assembly of rules is a called a match. For example:

val recip : int -> int =
    fn 0 => 0 | n:int => 1 div n

The fun notation is generalized so we can be more concise:

fun recip 0 = 0
  | recip (n:int) = 1 div n

Case analysis on the values of a heterogeneous type is performed by application of a clausally-defined function. The notation:

case exp
  of pat1 => exp1
   | ...
   | patn => expn

is short for the application:

(fn pat1 => exp1
  | ...
  | patn => expn)
exp

Matches are subject to two forms of “sanity check” – exhaustiveness checking and redundancy checking.

Chapter 7 – Recursive Functions

For a function to be able to call itself, it needs a name. For example:

val rec factorial : int->int =
    fn 0 => 1 | n:int => n * factorial (n-1)

or using fun notation:

fun factorial 0 = 1
  | factorial (n:int) = n * factorial (n-1)

Iteration

If we define a helper function that accepts an accumulator we can reduce the storage needed:

fun helper (0,r:int) = r
  | helper (n:int,r:int) = helper (n-1,n*r)
fun factorial (n:int) = helper (n,1)

It’s better programming style to hide the helper function w/in a local declaration:

local
    fun helper (0,r:int) = r
      | helper (n:int,r:int) = helper (n-1,n*r)
in
    fun factorial (n:int) = helper (n,1)
end

Tail recursive functions are analogous to loops in imperative languages – they iterate the computation w/o needing auxiliary storage.

Mutual Recursion

Definitions which are mutually recursive can be joined together with the and keyword to indicate they are defined simultaneously by mutual recursion:

fun even 0 = true
  | even n = odd (n-1)
and odd 0 = false
  | odd n = even (n-1)

Chapter 8 – Type Inference and Polymorphism

Standard ML allows you to omit type information whenever it can be determined from context. Consider the following:

fn s:string => s ^ "\n"

There is no need to declare the type of s since it can be inferred from the context, so we may write:

fn s => s ^ "\n"

A type scheme is a type expression involving one or more type variables standing for an unknown, but arbitrary type expression. Type variables are written ‘a (pronounced alpha), ‘b (pronounced beta), etc. For example, the type scheme ‘a->’a has instances int->int, string->string, (int*int)->(int*int), and (’b->’b)->(’b->’b), etc. It does not have the type int->string.

We may bind an identity function to the variable I as follows:

val I : 'a->'a = fn x=>x

We may also write:

fun I(x:'a) : 'a = x

Standard ML eliminates the need to ascribe a type scheme to the variable:

val I = fn x=>x

or

fun I(x) = x

Chapter 9 – Programming with Lists

The values of type type list are the finite lists of values of type type:

1. nil is a value of type typ list.
2. if h is a value of type typ, and t is a value of type typ list,
   then h::t is a value of type typ list.
3. Nothing else is a value of type typ list.

The type expression typ list is a postfix notation for the application of the type constructor list to the type typ.

A value val of type typ list has the form:
val1 :: (val2 :: (… :: (valn :: nil) … ))

The :: operator is right-associative, so we may omit parentheses:
val1 :: val2 :: … :: valn :: nil

Or, we may use list notation:
[ val1, val2, ..., valn ]

Computing With Lists

Some examples:

fun length nil = 0
  | length (_::t) = 1 + length t

Note we do not give a name to the head of the list, instead we use a wildcard _

fun append (nil, l) = l
  | append (h::t, l) = h :: append (t, l)

The latter is built into Standard ML and is written using infix as: exp1 @ exp2

fun rev nil = nil
  | rev (h::t) = rev t @ [h]

The running time of the latter is O(n2). The following definition makes use of an accumulator and has a running time of O(n):

local
    fun helper (nil, a) = a
      | helper (h::t, a) = helper (t, h::a)
in
    fun rev' l = helper (l, nil)
end

Chapter 10 – Concrete Data Types

Non-Recursive Datatypes

Example of nullary i.e. zero argument, constructors:

datatype suit = Spades | Hearts | Diamonds | Clubs

It is conventional to capitalize the names of value constructors, but this is not required by the language.

Datatypes may be parameterized by a type:

datatype 'a option = NONE | SOME of 'a

The values are NONE or Some val, where val is a value of type typ. The option type constructor is pre-defined in Standard ML.

Option types can also be used in aggregate data structures:

type entry = { name:string, spouse string option }

An entry for an unmarried person would have a spouse field with a value of NONE.

Recursive Datatypes

datatype 'a tree =
  Empty |
  Node of 'a tree * 'a * 'a tree

1. The empty tree Empty is a binary tree.
2. If tree_1 and tree_2 are binary trees, and val is a value of type type, then Node (tree_1, val, tree_2) is a binary tree.
3. Nothing else is a binary tree.
A function to compute the height of a binary tree, and one to compute the number of nodes:

fun height Empty = 0
  | height (Node (lft, _, rht)) = 1 + max (height lft, height rht)

fun size Empty = 0
  | size (Node (lft, _, rht)) = 1 + size lft + size rht

Heterogeneous Data Structures

The tree data type above requires that the type of the data items at the nodes must be the same for every node of the tree. To represent a heterogeneous tree, the data item must be labelled with enough info to determine the type at run-time.

datatype int_or_string =
  Int of int |
  String of string

type int_or_string =
  int_or_string tree

Datatype declarations and pattern matching can be useful for manipulating the abstract syntax of a language. Consider an example representing arithmetic expressions:

datatype expr =
  Numeral of int |
  Plus of expr * expr |
  Times of expr * expr

fun eval (Numeral n) = Numeral n
  | eval (Plus (e1, e2)) =
    let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
    in
        Numeral (n1+n2)
    end
  | eval (Times (e1, e2)) =
    let
        val Numeral n1 = eval e1
        val Numeral n2 = eval e2
    in
        Numeral (n1*n2)
    end

If we extend the expr datatype as follows:

datatype expr =
  Numeral of int |
  Plus of expr * expr |
  Times of expr * expr
  Recip of expr

The compiler will complain about eval being incompatible with the new version of expr. Recompiling eval will produce an inexhaustive match warning since eval lacks a case for Recip. This is one of the benefits of static typing provided in Standard ML.

Tags: ,

Table of Contents

Chapter 4 – Functions

Lambda expressions are written as:

fn var : typ => exp

For example:

fn x : real => Math.sqrt (Math.sqrt x)
(fn x : real => Math.sqrt (Math.sqrt x)) (16.0)
val fourthroot : real -> real =
  fn x : real => Math.sqrt (Math.sqrt x)
fourthroot 16.0

ML provides a special syntax for function bindings that’s more concise:

fun fourthroot (x:real):real = Math.sqrt (Math.sqrt x)

By experimenting (I expect this is covered later), I found the following is sufficient due to type inference:

fun fourthroot x = Math.sqrt (Math.sqrt x)

Local val bindings may shadow parameters and other val bindings. For example, in the following, the last occurrence of x refers to the parameter of h, but the preceding two occurrences of x refer to the local binding with a value of 2.0:

fun h(x:real):real =
  let val x:real = 2.0 in x+x end * x

Chapter 5 – Products and Records

The n-tuple is the simplest form of aggregate data structure. They are of the form:
(val1, … ,valn)
An n-tuple is a value of a product type of the form:
typ1* … *typn
For example:

val pair : int * int = (2, 3)
val triple : int * real * string = (2, 2.0, "2")
val pair_of_pairs : (int * int) * (real * real) = ((2,3),(2.0,3.0))

A 0-tuple, also known as a null tuple, is the empty sequence of values, ( ). It is a value of type unit. 1-tuples are absent from the language.

Tuple Patterns

We can use pattern matching to extract portions of an n-tuple. For example, in the code snippet below, r will be equal to 3.14. Underscores indicate “don’t care” positions:

val foo : (int * string) * (real * char) = ((7,"hello"),(3.14,#"z"))
val ((_, _), (r:real, _)) = foo

We can give names to the first and second components of the pair – using the REPL:

- val (is:int*string,rc:real*char) = foo;
val is = (7,"hello") : int * string
val rc = (3.14,#"z") : real * char

A pattern is one of three forms:

  1. A variable pattern of the form var : typ
  2. A tuple pattern of the form (pat1, …, patn), where each pati is a pattern. This includes as a special case the null-tuple pattern, ()
  3. A wildcard pattern of the form _

Record Types

Tuples can become more difficult to use as the number of elements increases. Record types allow labeling each component. A record type has the form:
{ lab1:typ1, …, labn:typn}
A record value has the form:
{ lab1=val1, …, labn=valn}
A record pattern has the form:
{ lab1=pat1, …, labn=patn}

For example, the record type hyperlink is defined as follows:

type hyperlink =
  { protocol : string,
    address  : string,
    display  : string }

The following record binding defines a variable of type hyperlink:

val mailto : hyperlink =
  { protocol = "mailto",
    address = "foo@bar.com",
    display = "Brian Adkins" }

The following record binding:

val { protocol=prot, display=disp, address=addr } = mailto

decomposes into the three variable bindings:

val prot = "mailto"
val addr = "foo@bar.com"
val disp = "Brian Adkins"

We can use wildcard to extract selected fields:

val {protocol=prot, address=_, display=_ } = mailto

However, this isn’t very helpful with many fields, so we can use ellipsis patterns:

val {protocol=prot, ... } = mailto

ML provides an abbreviated form of record pattern {lab1,…labn} which stands for { lab1=var1, …, labn=varn} where the variables have the same name as the corresponding labels. For example, the following:

val { protocol, address, display } = mailto

decomposes into these bindings:

val protocol = "mailto"
val address = "foo@bar.com"
val display = "Brian Adkins"

Multiple Arguments and Multiple Results

fun dist (x:real, y:real):real = sqrt (x*x + y*y)

Keyword parameters are supported through record patterns:

fun dist’ {x=x:real, y=y:real} = sqrt (x*x + y*y)

Invoked as follows:

dist' {x=2.0,y=3.0}

Functions with multiple results may be thought of as functions yield tuples (or records).

fun dist2 (x:real, y:real):real*real
  = (sqrt (x*x+y*y), abs(x-y))

Sharp notation allows us to conveniently access the Nth item in a tuple.

- val foo = (3,7,5,2);
val foo = (3,7,5,2) : int * int * int * int
- #3 foo;
val it = 5 : int

A similar notation is used for record field selection:

 - val foo = { name = "Brian", phone = "555-1212" };
val foo = {name="Brian",phone="555-1212"} : {name:string, phone:string}
- #name foo;
val it = "Brian" : string

However, Harper states, “Use of the sharp notation is strongly discouraged!

Tags:

Table of Contents

The posts in this series will be notes & highlights from working through “Programming in Standard ML” by Robert Harper. See Getting Started with Standard ML for a link to the PDF. I may not always use quotes in this series, but it should be assumed that any factual information about Standard ML that I write in this series has been obtained from the PDF above – this series is, in effect, a summarization of Robert Harper’s PDF.

Overview

Attributes of Standard ML:

  • Type-safe programming language
  • Statically typed
  • Extensible type system
  • Polymorphic type inference
  • Automatic storage management
  • Encourages functional (effect-free) programming
  • Allows imperative (effect-ful) programming where appropriate
  • Pattern matching
  • Extensible exception mechanism for handling error conditions
  • Expressive and flexible module system
  • Portable due to its precise definition
  • Portable standard basis library

Chapter 1 – Programming in Standard ML

This chapter provides a brief overview of the language by considering an implementation of a regular expression package. It was slightly more familiar to me because of my brief studies of Haskell; otherwise, I think it would’ve appeared very foreign. I prefer more of a bottom up approach as much as possible instead of looking at language features that haven’t been described.

Currying

I do think that currying & partial application are cool features. Consider the following function declaration:

val match : RegExp.regexp -> string -> bool

Coming from a non-currying background, I would describe match as a function that accepts a regular expression and a string, and returns a boolean value indicating whether the string matches the regular expression, but Robert describes it this way:

It contains a function match that, when given a regular expression, returns a function that determines whether or not a given string matches that expression.

This was one of the features that made a strong impression on me when I first started looking at functional programming languages.

Chapter 2 – Types, Values, and Effects

Computation in ML is of a somewhat different nature. The emphasis in ML is on computation by evaluation of expressions, rather than execution of commands.

An expressions has three characteristics:

  • It may or may not have a type
  • It may or may not have a value
  • It may or may not create an effect

Effects will be ignored until chapter 13, so until then, it’s assumed that all expressions are effect-free, or pure. A type is defined by specifying three things:

  • a name for the type,
  • the values of the type, and
  • the operations that may be performed on values of the type

Notes:

  • Negative numbers, as literals, are indicated with a tilde instead of a hypen e.g. ~7
  • div is used for integers, / for reals

A typing assertion has the form: exp : typ For example: 3 : int 4 div 3 : int ML does not perform any implicit conversions between types. Use real(3) + 4.3, or 3 + round(4.3) Conditional expression (exp1 and exp2 must have the same type): if exp then exp1 else exp2 if not exp then exp1 else exp2

Chapter 3 – Declarations

Once a variable is bound to a value, it is bound to it for life; there is no possibility of changing the binding of a variable once it has been bound. Examples of type bindings:

type float = real
type count = int and average = real

Examples of value bindings:

val m : int = 3+2
val pi : real = 3.14 and e : real = 2.17

One thing to keep in mind is that binding is not assignment. The binding of a variable never changes; once bound to a value, it is always bound to that value (within the scope of the binding). However, we may shadow a binding by introducing a second binding for a variable within the scope of the first binding.

Limiting scope with let expressions:

let
  val m : int = 3
  val n : int = m*m
in
  m*n
end

The following expression evaluates to 54 because the binding of m is temporarily overridden during the evaluation of the let expression, then restored upon completion of this evaluation:

val m : int = 2
val r : int =
    let
        val m : int = 3
        val n : int = m*m
    in
        m*n
    end * m

Tags:

One of the two parallel tracks in my 2009 Programming Language Plan begins with the Standard ML programming language, so it’s time to get started.

Standard ML Resources

Compilers

Books

Since I was unfamiliar with the Standard ML programming language, I was surprised to find there are a number of good books about the language. Following are just some of them:

Elements of ML Programming

ML for the Working Programmer

Purely Functional Data Structures

The Definition of Standard ML

The Standard ML Basis Library

Introduction to Programming using SML

Concurrent Programming in ML

Other Educational Materials

Programming in Standard ML – excellent online book by Robert Harper of Carnegie Mellon University. Since I don’t know if Standard ML will simply be a stepping stone to Haskell (which in turn may not be a primary language for me) or a language I invest a lot of time in, I’m going to restrict myself from my normal method of purchasing a book or two when learning a new language. Instead, I’ll be going through Harper’s online book initially.

Tips for Computer Scientists on Standard ML (Revised)

Hello World

There are a number of great compilers for Standard ML (listed above), but I only need one to get started, so I chose Standard ML of New Jersey despite the funky name. It’s a popular version, and it has a REPL, so it’s good enough for me for now.

I develop software on Mac OSX and deploy on Ubuntu Linux. On my Ubuntu server, installing SML/NJ was as simple as:

sudo apt-get install smlnj

On Mac OSX, there are a couple of options listed on this page. I could use a pre-built system or the generic Unix install, so naturally I chose the generic Unix install which installed easily according to the simple directions.

# Download config.tgz
tar xzf config.tgz
config/install.sh

# Wait for install to complete

~/software/smlnj$ rlwrap bin/sml
Standard ML of New Jersey v110.69 [built: Sat May  2 12:04:08 2009]
- print "hello, world\n";
hello, world
val it = () : unit
-

Great, looks like everything is working fine. Note, I use the rlwrap utility to provide a nicer REPL experience, but it’s not required.

I’ll continue with a series of posts with notes from working through “Programming in Standard ML”.

Tags: ,

Despite the numerous ways in existence to quantify programming language popularity, I thought I’d throw yet another one into the mix. I made a number of Google searches of the forms below and averaged the results:

"implemented in <language>"
"written in <language>"

I’m very curious to see how these stats change over time, so I’ve added a calendar item to recompute them in six months. Leave a comment if you’d like to add a programming language to the list, and I’ll update this article and it will be included in the recomputation six months from now.

Language # Results
C 1,905,500
Java 850,000
C++ 699,000
PHP 680,000
Python 396,000
Perl 365,500
C# 349,700
Lisp Family1 176,507
JavaScript 102,700
Ruby 99,650
Scheme 86,450
Lisp 61,900
Tcl 44,800
ML Family2 29,062
Haskell 22,550
Erlang 22,285
OCaml 22,000
Common Lisp 20,600
Prolog 17,750
Lua 13,065
Smalltalk 9,105
Arc 6,775
Forth 6,465
(S)ML3 5,173
Scala 3,570
Caml 1,889
Io 1,760
Clojure 782

1 combines Lisp, Scheme, Common Lisp, Arc & Clojure
2 combines OCaml, (S)ML, Caml
3 summed separate searches for sml and ml
Update 4/23/09 added C#, Tcl per comment requests.

Tags: , , , , , , , , , , , , , , , , , , , ,