Types are an immensely important concept in programming language design and theory.
A type can be thought of as a category that a value (typically at run-time)
can fall into. For example, the number 12 would have type Int
, and its
corresponding value would be 12.
Languages can be categorized into at least four camps, based on how deeply, or on whether or not the language’s design embraces types. A language can be “statically typed” versus “dynamically checked”, or “strong” versus “weak”. Intuitively, such typing disciplines:
Statically typed languages, which can have strong or weak typing, distinguish two phases; compile time and runtime. That is, code is compiled (during which it is checked for correctness) once, and then it can be run multiple times.
Dynamically checked languages, which can also have strong or weak typing, don’t have such a distinction. Instead, code is parsed and run (often with runtime checks for correctness each time it’s run) all always at the same time.
A language is considered to have strong or weak typing depending on whether or not its correctness checks (either compile time or runtime) agree with what is actually held in memory.
For example, a strong statically typed language like Scala statically (at
compile time) ensures that a certain value with type Int
is correctly used
throughout the program, and that at runtime, nothing else other than an Int
can be held in that value’s memory location.
On the other hand, weak statically typed languages like C statically ensure
that a certain value with type Int
is correctly used throughout the program,
but at runtime, it can’t ensure that nothing other than an Int
will be
held in that value’s memory location. Pointer arithmetic makes this guarantee
impossible — a programmer can easily place a String
in that Int
’s memory
location, which can lead to runtime errors.
Dynamically checked languages can also have strong or weak typing. Languages
like Python are strongly dynamically checked because they check for
correctness at runtime, preventing one from calling a method defined on a
String
for example on that of an Int
. In addition, these languages provide
no way for the programmer to write to an arbitrary memory location. Assembly,
on the other hand, is weak with no typng — it’s dynamic in that there is
no static or runtime checking, and it allows users to directly manipulate
memory.
“A type system can be regarded as calculating a kind of static approximation to the run-time behaviors of the terms in a program”
— Benjamin Pierce, Types and Programming Languages, 2002
Strong static typing has many benefits,
In line with scalability, in his monograph entitled Typeful Programming, Luca Cardelli remarks,
“Types are essential for the ordered evolution of large software systems. Types provide a way of controlling evolution, by partially verifying programs at each stage.”
— Luca Cardelli, Typeful Programming, 1993
The most widely heralded benefit of static typechecking is that it enables the early detection of some programming errors. By catching errors at compile time, programmers are free from worrying about stumbling across certain errors in a running application.
In fact, many programmers happily remark that their programs “just work” after passing the typechecker. This is likely due to the fact that such a type system tends to catch simple errors, such as forgetting to convert a string to a numeric type before doing a numerical operation on it, as well as complex errors, such as neglecting a boundary condition in a complex case analysis. The common theme here is that all of these errors manifest themselves as inconsistencies at the level of types.
The degree to which programmers are satisfied, however, tends to depend on the expressiveness of the type system they’re using. This is important to note, because later on, we will see how Scala’s advanced type system makes it possible to retain expressivity meanwhile enabling users to make typechecking more precise.
However, static type systems aren’t without their drawbacks.
“Being static, type systems are necessarily also conservative: they can categorically prove the absence of some bad program behaviors, but they cannot prove their presence, and hence they must also sometimes reject programs that actually behave well at runtime.”
— Benjamin Pierce, Types and Programming Languages, 2002
Additionally, there are varieties of errors that can’t be detected by conventional type systems. For instance, divisions by zero, non-termination, or array bound violations. It is here that unit testing (orthogonal to static typechecking), or dependent types (typechecking that depends on values) can come into play to catch these errors.
Scala is strongly statically typed, but it additionally stands out amongst other statically typed languages as having a particularly advanced advanced type system.
A helpful way to think about Scala’s type system is to approach it as if types are boundaries that you erect for yourself to safeguard against wrong behavior. The more you understand and know how to use Scala’s type system, the more you can get the type system to do for you, that is, the more expressive you can make it, all meanwhile retaining the same level of protection[1].
We’ll start by covering most basic types that come predefined in Scala that all programmers should know. Afterwards, we’ll see how to define your own types in Scala. Finally, we’ll cover the more advanced constructs of Scala’s type system that allow programmers to
Every type defined in a Scala program is essentially made up of a combination of Scala’s basic types. A hierarchy of all of Scala’s predefined and most basic types is shown below.
figure here
Types such as existential types or refinement types are combinations of existing types; they define new types, but without giving them a name
Intuitively, an existential type is a type with some unknown parts in it.
Formally, existential types, or “existentially quantified types,” are the dual of so-called “universally quantified types,” or generic types.
To understand how, first consider the following generic (universal) type;
Wombit[T]
, parameterized on generic type parameter T
, is valid for all
types T
. An existential type is the dual; Wombit[T] forSome { type T }
means there exists some type T
for which Wombit[T]
can be constructed.
Thus, rather than Wombit[T]
being able to be constructed for any T
, as
is the case for universal types, an existential Wombit[T]
can be constructed
for some T
, not necessarily all.
Wombit[T] forSome { type T }
Importantly, an existential type includes references to abstract type members
that we know exist, but whose concrete values/types we don’t know. For
example, in the above, T
is a type we don’t know concretely, but that we
know exists.
Note that the above can be written in shorthand, Wombit[_]
, which desugars
to Wombit[T] forSome { type T }
.
In his book, Theories of Programming Languages, John C. Reynolds remarks of the utility of existential types,
“Existential types are useful in their own right. In particular, a program using an unimplemented abstract type can be realized by a function that accepts an argument, of existential type, that describes an implementation of the abstract type; then the program can be executed with different implementations by applying the function to different arguments”
— John C. Reynolds, Theories of Programming Languages, 1998
The key notion to hold on to here is that, existential types make it possible to leave some parts of your program unknown, while still being able to typecheck it with different implementations for those unknown parts.
Note the similarity here to the utility of abstract type members. Often, abstract type members are used to achieve the same purpose — to “hide” or otherwise avoid concretely specifying some types until later. As we will later see, existential types can actually be more naturally modeled using abstract type members.
Let’s first consider an example which seeks to use existential types to make it easier to write code that works for many different implementations of the same abstract class.
In this example, we seek to make a functional counter that can be incremented,
that can get the value in its underlying representation, and that can convert
the underlying representation to an Int
.
abstract class Counter[T] {
def inc: Counter[T]
def get: T
def convert(x: T): Int
}
class LongCounter(count: Long = 0L) extends Counter[Long] {
def inc: Counter[Long] = new LongCounter(count+1)
def get: Long = count
def convert(x: Long): Int = x.toInt
}
// want to define a function that can work on any type of counter
// we cannot express this as a Counter[Any] bc Counter[Int] is not a subtype of Counter[Any]
def fun(c: Counter[t] forSome { type t }): Int = c match {
case cnt: Counter[a] => cnt.convert(cnt.inc.inc.get)
}
scala> new LongCounter
res2: LongCounter = LongCounter@6c7fd3e4
scala> fun(res2)
res3: Int = 2
def fun(c: Counter[t] forSome { type t }): Int =
c.inc.inc.get
scala> new LongCounter
res0: LongCounter = LongCounter@a779ab4
scala> fun(res0)
res1: Int = 2
Abstract type members also make it possible to hide the internal type of an abstraction. For example,
trait Fruit {
type T
val weight: Int
val tooRipe: T => Boolean
}
class Farm {
val fruit = new ArrayBuffer[Fruit]
}
...
As compared to,
case class Fruit[T](val weight: Int, val tooRupe: T => Boolean)
class Farm {
val fruit = new ArrayBuffer[Fruit[T] forSome { type T }] val fruit = new ArrayBuffer[Fruit[T] forSome { type T }]
}
...
Both code examples behave the same. In fact, one might argue that the example using abstract type members over existential types is less error-prone. This arises due to the fact that the existential type is unnamed. Thus if we had wanted to reuse the existential type elsewhere to construct some other data structure, we would need to reorganize our code to use abstract types instead of existentials.
Given then, that abstract types can be used more naturally more often than existential types to achieve the same purpose, one might ask why they exist at all in Scala.
The answer is Java interoperability. Java constructs like wildcards (types such
as Person<?>
) and raw types (types such as List
which omit type
parameters) are modeled using existential types.
Build something out of this http://stackoverflow.com/questions/4415511/scala-type-programming-resources
The structure they adopt is quite useful!
Typeclasses are a popular language feature of Haskell. In Scala, one can simulate typeclasses, that is, typeclasses are a pattern rather than explicitly built into the language.
One reason why Scala’s type system is considered to be powerful.
On a high level, typeclasses allow retrofitting types with interfaces (even predefined types like Int).
Typeclasses are a language feature in Haskell.
In Scala, typeclasses are a type-based pattern based on implicits that is becoming more and more common.
To become familiar with the key terminology of typeclasses, it is helpful to think about typeclasses to be completely different from Java-style classes.
In the context of Scala, a typeclass is a generic trait. Example:
trait Ordering[T] {
def compare(x: T, y: T): Int // abstract
...
}
Concrete implementations of a typeclass are defined as implicits. Example:
implicit object intOrdering extends Ordering[Int] {
def compare(x: T, y: T): Int = x - y
}
These concrete implementations are called typeclass instances.
The fact that they are defined as implicits is a fundamental aspect of the typeclass pattern. It allows simultaneously expressing constraints and making suitable implementations available. Example:
/* @return a sorted sequence with the same elements as `s` */
def sort[T](s: Seq[T])(implicit ev: Ordering[T]): Seq[T] = ...
Calling the above sort
method only type-checks if an implicit value of type
Ordering[T]
can be found.
Moreover, within the body of sort
, the witness ev
(or “evidence
parameter”) provides a concrete implementation of the required typeclass
instance for type T
.
Since this pattern is so common, it can be abbreviated using a context bound
on the type parameter T
:
/* @return a sorted sequence with the same elements as `s` */
def sort[T: Ordering](s: Seq[T]): Seq[T] = ...
(Note that we wrote Ordering
without specifying a type argument; a context
(bound requires that the bound is a type constructor.)
How do we get access to the witness (the concrete implementation of
Ordering[T]
) in this case?
The method implicitly
defined in the Predef
singleton object gives us
access:
def sort[T: Ordering](s: Seq[T]): Seq[T] = {
val ordering = implicitly[Ordering[T]]
...
}
The implicitly
method looks like magic, but it has a very simple definition:
def implicitly[T](implicit e: T) = e
(Remember that you can always pass explicit type arguments to polymorphic methods!)
In a full-spectrum language, types can contain arbitrary terms. For example, in Coq, you can attach a predicate to a type, for example.
[1] Analogy from Josh Suereth’s excellent book, Scala in Depth.