Structural typing is a concept in coding where we primarily consider the shape of a type, meaning its properties and methods, instead of the name of the type. This contrasts with the more widely understood nominative typing, where the name of the type is of utmost importance.
In this article, I want to introduce the basics of structural typing, starting from a neutral viewpoint and explaining it without a particular language in mind. I will compare it to interfaces, which are not structural types. Then, I will look at Python to compare duck-typing to structural typing, and how this relates to its newer nominative type system. I’ll also look at C++, to show the inferred structural typing of generics, and how constraints are the explicit form. Finally, I’ll look briefly at TypeScript, as it’s perhaps the most common structurally typed language.
Nominative typing is the familiar kind we see in Java, C++, C#, and typed Python, among others. Most coders probably understand nominative typing. Otherwise they’re likely working in dynamically typed languages, like JavaScript or untyped Python. I’ll touch briefly on how that differs from structural typing in the “duck typing” section.
A Type
Let’s briefly consider what a type is: it’s something with a name, and a set of properties and methods associated with it. Most of us are familiar with classes, and that’s perhaps a decent generic way of looking at a type. Here’s an example of a user, in pseudo-code:
    class User:
        String name
        String email

The name of this type is `User`, the most important part of it in a nominative type system. The structure of this type, the important part in a structural type system, is the collection of properties and methods: everything except the name. This example doesn’t include a method, for simplicity, and to exemplify a common structural typing use-case.
Nominative vs. Structural arguments
To understand the difference between nominative and structural types, we can look at the case of calling a function.
    function send_login_reminder( User recipient ):
        ...

    tom = load_user("tom")
    send_login_reminder(tom)

When the compiler, or interpreter, reaches the `send_login_reminder` line, it checks if the type of the argument matches the required type specified in the function declaration. Here it checks if `tom` matches the type `User`. How this matching works is significantly different in the different type systems.
In nominative typing, the argument has to be the `User` type. If we have inheritance, this could also be any type that inherits from `User`. But that’s it. Nothing else can match that type.
In structural typing, only the form of the type is checked. It isn’t relevant that it’s a `User`. For example, we can call this function with an anonymous object instead.
    send_login_reminder({
        name = "Yara"
        email = "yara@mail.com"
    })

This fails in nominative typing, since it isn’t the `User` type. In structural typing, the type-checker sees it looks the same as a `User`, so it’s an appropriate match. The object has a `name` and `email`, just like the properties of `User`.
Aliases
In this view, the names of types in a structural type system are effectively only aliases. The `send_login_reminder` function could be equally written, in pseudo-code, like this:

    function send_login_reminder( recipient as {
        String name
        String email
    })

The symbol `User` is nowhere to be found, yet the function is semantically identical.
To dive deeper into types, take a look at my article on how compilers do type conversion.
Type Plurality
With nominative types, every object has a specific type, or set of types, that identifies it. When we create a `User` object, that object is actually of the type `User`, and nothing else. Perhaps that `User` inherits from a base-class, in which case it is also of that type. But that’s it. It’s not any of the other types that we define in our code-base. In this way, we can speak of the object as having a particular type, and checks like `instance of` make sense. Nominative types have real identities.
With structural types, objects don’t have a strict set of types that identify them. Any object matches any type, across all our code and all libraries, that has a structurally compatible form. For example, if I declare this type:

    class HasName:
        String name

Any object which has a `String name` field, regardless of how it was created, or which types were involved in its creation, will now also match the `HasName` type. This means that objects don’t have a limited set of types, and can instead match many types in the code base. From this view, we can’t speak of objects as having a particular type, and checks like `instance of` are nonsensical: structural types don’t have identities.
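To make this concrete, here is a minimal Python sketch using `typing.Protocol`, which is covered in the Python section of this article. The `@runtime_checkable` decorator is added purely for illustration: it turns `isinstance` into a question about shape rather than identity, and it only tests that the attribute exists, not that it is a string. `Pet`, `Planet` and `matches_has_name` are hypothetical names that do not appear elsewhere in the article.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasName(Protocol):
    # Structural type: anything with a `name` attribute matches.
    name: str


@dataclass
class Pet:  # hypothetical class, never mentions HasName
    name: str
    species: str


class Planet:  # hypothetical class, built in a completely different way
    def __init__(self, name: str) -> None:
        self.name = name


def matches_has_name(obj: object) -> bool:
    # With @runtime_checkable this only checks that `name` is present.
    return isinstance(obj, HasName)
```

Both `Pet` and `Planet` objects match `HasName` even though neither declares any relationship to it, while a value with no `name` attribute, such as an integer, does not.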
Inclusive, excess properties
Structural type systems also tend not to care about excess properties. For example, perhaps we have a `Company` type as below.

    class Company:
        String name
        String email
        Address address
        Region region

We can call the `send_login_reminder` function with objects of this type as well.

    company = load_company("highland")
    send_login_reminder(company)

Whether this makes sense in our code depends on whether or not companies can log in. If they can’t, this is a defect that a structural type system would not catch.
In contrast, if we had a `send_catalog` function, we’d probably want it to work for both `User` and `Company` types, which, with no type changes, works fine in structural typing. In nominative typing, we’d have to introduce an interface, or shared base class, that exposes the common `name` and `email`.
Variables
I’ve used function argument matching as a type-matching example. The same rules apply when assigning to a variable: the value on the right side is checked against the type on the left side.

    jane = load_user('jane')
    User b = jane

In the second line, the type checker will check if the value to be assigned matches the explicitly declared type of the variable. Does `jane` match the type of `User`? As before, in structural typing, the name `User` is irrelevant, and only the form of the type is considered. So the following is allowed.

    User b = {
        name = "Xiomara"
        email = "x473@mail.com"
    }

The form of the value on the right matches the form on the left, thus it is allowed.
Interfaces
Interfaces, as defined in C++, Java and C#, as well as Python’s abstract base classes, are not the same as structural types. These kinds of interfaces are still nominative types: a function that takes an interface type as an argument requires that the argument’s type explicitly implement that interface.
Thinking back to before, we wanted a `send_catalog` function that works with both `User` and `Company` types. In structural types, we said that just works, since they have the right fields. But for nominative typing, we’d need to explicitly say the types are related. We typically do this with an interface.

    interface EmailRecipient:
        String name
        String email

    class User implements EmailRecipient:
        ...

    class Company implements EmailRecipient:
        ...

    function send_login_reminder( User recipient ) // accepts only User
    function send_catalog( EmailRecipient recipient ) // accepts User and Company

With nominative types, the `send_catalog` function requires that its argument’s type implement the `EmailRecipient` interface. The below, for example, would not be valid:

    class LooksLikeRecipient:
        String name
        String email

    kenny = new LooksLikeRecipient( name = 'Kenny', email = 'k273@mail.com' )
    send_catalog(kenny) // fails, kenny is not an EmailRecipient

It doesn’t matter that `LooksLikeRecipient` has the same form as the `EmailRecipient` interface. That’s not what is important in a nominative type. The type has to explicitly say it implements the `EmailRecipient` interface, whereas in a structural type, simply having the `name` and `email` fields is enough. In a structurally typed language, the call to `send_catalog( kenny )` is valid.
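The nominative case can be sketched with Python’s abstract base classes. This is an illustration under assumptions, not a recommended pattern: the `isinstance` check inside `send_catalog` is a runtime stand-in for what a nominative type checker would reject statically, and the function returns a string instead of sending anything.

```python
from abc import ABC


class EmailRecipient(ABC):
    # Nominative interface: a class must name it as a base to match.
    name: str
    email: str


class User(EmailRecipient):
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email


class LooksLikeRecipient:
    # Same fields as User, but never declares EmailRecipient.
    def __init__(self, name: str, email: str) -> None:
        self.name = name
        self.email = email


def send_catalog(recipient: EmailRecipient) -> str:
    # Runtime stand-in for the static check of a nominative type system.
    if not isinstance(recipient, EmailRecipient):
        raise TypeError(f"{type(recipient).__name__} is not an EmailRecipient")
    return f"Catalog sent to {recipient.name} <{recipient.email}>"
```

Passing a `User` succeeds, while passing a `LooksLikeRecipient` raises `TypeError` despite the identical fields, because only the declared relationship counts.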
Python
Python was originally a dynamically typed language: there was no static type checking to see if arguments or assignments match. Python 3.5 introduced type hints, which then became the basis of Python’s static type system.
Python’s dynamic type system was referred to as duck-typing, which is a kind of implied structural type system — though with some complications. The introduced type hints, however, added a nominative type system.
Duck Typing
Duck typing is the quintessential aspect of dynamic typing. Effectively there are no static types, and whether code is valid depends solely on whether the statements executed at runtime make sense for the values involved. Despite this dynamic nature, we can still talk about the types a function takes, albeit in a less strict way.
For example, consider our familiar function and rough implementation:
    def send_login_reminder( recipient ):
        message_raw = load_reminder_template()
        message = message_raw.format( recipient.name )
        send_email( recipient.email, message )

Looking at the body of this function, we can tell that it works so long as the `recipient` argument has both a `name` and `email` property. Despite it being dynamic, we can talk about the expected type of the `recipient` argument; being explicit or inferred tends not to change the fundamental way in which a type system works. For example, if we were a type checker, we would infer the following:

    interface _send_login_reminder_argument_0:
        String name
        String email

    def send_login_reminder( _send_login_reminder_argument_0 recipient ):
        ...

This looks and behaves much like the `User` type works in a structural type system. To see how it differs, we need to introduce a conditional branch.
    def send_login_reminder( recipient ):
        message_raw = load_reminder_template()
        message = message_raw.format( recipient.name )
        if recipient.email.endswith( ".special.com" ):
            send_special_email( recipient.email, recipient.token, message )
        else:
            send_email( recipient.email, recipient.priority, message )

What would the type of `recipient` have to look like now?

    interface _send_login_reminder_argument_0:
        String name
        String email
        possibly Integer token
        possibly Integer priority

The `possibly` pseudo-code is not the same as an optional field, since in the code there is a path that definitely needs the `token` and one that definitely needs the `priority`, but not both. We can still express this as a structural type, but perhaps not the one you’re expecting. I’m shortening the names here for clarity.

    interface _base:
        String name
        String email

    interface _variant_0 extends _base:
        Integer token

    interface _variant_1 extends _base:
        Integer priority

    interface argument_0 = _variant_0 or _variant_1

In this code `or` means a union type, where the type can be one or the other. We can create structural types that define the inferred type of a duck-typed function. But the resulting type can become quite complex, with every branch introducing another union of similar types. They aren’t that useful as types. This is why, when migrating from duck-typing to structural typing, we’d probably opt for simple `optional` types for `token` and `priority`.
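The branch-dependent behaviour is easy to observe at runtime. The sketch below is a simplified, self-contained variant of the function above: it assumes a fixed message instead of a loaded template, and it returns a tuple instead of calling `send_special_email` or `send_email`, so it can run on its own. The three sample objects are hypothetical.

```python
from types import SimpleNamespace


def send_login_reminder(recipient):
    # Duck-typed: only the attributes touched on the executed path matter.
    message = f"Hello {recipient.name}, please log in."
    if recipient.email.endswith(".special.com"):
        # "Special" path: needs `token`, never reads `priority`.
        return ("special", recipient.email, recipient.token, message)
    # Regular path: needs `priority`, never reads `token`.
    return ("regular", recipient.email, recipient.priority, message)


# Each object satisfies only one of the two inferred variants.
special = SimpleNamespace(name="Ana", email="ana@corp.special.com", token=7)
regular = SimpleNamespace(name="Bo", email="bo@mail.com", priority=2)
# Has name and email, but is missing what its branch needs.
incomplete = SimpleNamespace(name="Cy", email="cy@mail.com")
```

Calling the function with `special` or `regular` works, although neither object has both `token` and `priority`. Calling it with `incomplete` raises `AttributeError`, and only once execution reaches the line that reads `priority`.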
Protocols
Python adopted a nominative type system in 3.5. You’re still allowed to use duck-typing, but you can also create statically typed functions and variables. And despite duck-typing being close to structural typing, they instead went with nominative typing. There are many reasons for this, and I’ll get to them in a future article.
Typed Python retains the ability to define structural types in the form of a `Protocol`. This is a special base class which makes a class behave like a structural type. Consider the following example, with the nominative types first.
    from dataclasses import dataclass

    @dataclass
    class User:
        name: str
        email: str

    def send_login_reminder( recipient: User ):
        ...

    jack = load_user('jack')
    send_login_reminder(jack)  # good

    lilou = {
        'name': 'Lilou',
        'email': 'lil8493@mail.com',
    }
    send_login_reminder(lilou)  # not allowed

The `recipient` argument of the function must be an explicit `User` type. Neither a dict nor another class with the same fields will work. It has to be an actual `User` object. But we can do something similar to the `EmailRecipient` used as an example before, using the `Protocol` class.
    from typing import Protocol

    class EmailRecipient(Protocol):
        name: str
        email: str

    def send_remind_email( recipient: EmailRecipient ):
        ...

Now, although the `User` class doesn’t mention `EmailRecipient` at all in its declaration, it still matches, since the form is the same. `Protocol` creates structural types. PEP 544 calls this “structural subtyping” or “static duck typing”. Though, as mentioned before, the structural types you define by hand will be stricter than the inferred structural types arising from duck typing.
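Putting the pieces together, a self-contained version might look like the sketch below. It reuses the `send_catalog` and `Company` names from earlier in the article. The function body, which returns a string instead of sending mail, and the sample values are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Protocol


class EmailRecipient(Protocol):
    name: str
    email: str


@dataclass
class User:
    name: str
    email: str


@dataclass
class Company:  # has an extra field, still matches structurally
    name: str
    email: str
    region: str


def send_catalog(recipient: EmailRecipient) -> str:
    # Hypothetical body: returns the text instead of sending mail.
    return f"To: {recipient.name} <{recipient.email}>"


# A static checker such as mypy accepts both calls, although neither
# class mentions EmailRecipient.
user_line = send_catalog(User("Jack", "jack@mail.com"))
company_line = send_catalog(Company("Highland", "info@highland.example", "EU"))
```

Note that the check happens in the static type checker. At runtime Python still behaves as duck-typed code, so the protocol adds no cost or enforcement there.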
C++
C++ has structural types hiding behind the `template` keyword. They were only made explicit in C++20 with the introduction of constraints and concepts.
Templates
In C++, templates allow defining a function where the types differ depending on how it is called. As a simple example, perhaps we have a function that compares the ages of two objects.

    template<typename ItemT>
    bool is_younger(ItemT a, ItemT b) {
        return a.age < b.age;
    }

This can be called with two objects of the same type, as long as they have an `age` defined, and that age can be compared. The age may be an integer, or it could be a date object, anything that defines a less-than operator. Arguments `a` and `b` have inferred types; in pseudo-code we could replace `ItemT` with an explicit structural type.

    interface LessThanT:
        bool operator < (LessThanT a, LessThanT b)

    interface ItemT:
        LessThanT age

Now, this isn’t how C++ works. It doesn’t infer this type. Instead, it substitutes the actual types at the point where the template is instantiated, and checks that the resulting code is valid. This is similar to how Python doesn’t infer its duck-types either: it simply runs the code and sees if an error occurs.
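For comparison, the Python side of that analogy can be sketched as follows. `Person` and `Wine` are hypothetical types invented for this example. Nothing is declared about `age`, and a mismatch only surfaces when the comparison actually runs.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:  # hypothetical: age stored as an integer
    age: int


@dataclass
class Wine:  # hypothetical: "age" stored as a date
    age: date


def is_younger(a, b):
    # No declared types: works for anything whose `age` supports `<`.
    return a.age < b.age
```

Two `Person` objects or two `Wine` objects can be compared, but mixing them raises `TypeError` at the `<`, since an integer and a date cannot be ordered against each other. A C++ template would report the equivalent problem at compile time, when the template is instantiated.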
Constraints and Concepts
The trouble with templates in C++ is that if you had a type error within template code, the error message could be long, hard to read, and hidden deep within a call-tree of templates. The same type of error happens with duck-typing in Python, where deep in some library code you get a type-mismatch.
For years, the C++ community debated the idea of explicit structural types in the form of concepts and constraints. Unfortunately, I have not had the chance to work with C++ since they were introduced, so I’m going to defer to a reference page.
To cope with C++’s static types, the concept system is stricter and more flexible than Python’s — of course at the typical C++ expense of added syntax.
TypeScript
TypeScript is primarily a structurally typed language. Its type primitives, `type` and `interface`, both define structural types, but the `enum` type defines a nominative type. The generic examples I’ve given about structural types work in TypeScript, with an exception for excess properties.
Enums and fundamental types
Enums are different in TypeScript. Since their values are effectively plain numbers or strings, and structural typing would treat all such values the same, it’d make for relatively useless enum types. TypeScript broke from its structural typing to say that `enum` is nominatively typed. An enum of one type will not match an enum of another type, even if they are both strings, and even if they share the same keys and values.
The structural nature of fundamental types is harder to explain, and the reason I’ve stuck with class types for this article. For example, take a simple “integer” type. What is the name, and what is the form of this type? Here, the term “integer” refers to both the name and its form, which can make it confusing.
Let’s consider instead an alias, using the TypeScript `number` type.

    type AgeT = number

    function is_younger(a: AgeT, b: AgeT) { ... }

`AgeT` is another name for `number`. As structural types don’t care about names, this shouldn’t be relevant. The `is_younger` function can be called with any numbers, and it could be declared with `number` instead of `AgeT`. The type `AgeT` is an interchangeable alias with `number`.
Some languages offer strong aliases, where, in this example, a `number` would not be passable to an `AgeT` type: you’d have to explicitly convert. Python has a natural way to do this with `NewType`.
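As a rough illustration of such a strong alias in Python, `typing.NewType` creates a name that a static checker treats as distinct from its base type. The values below are made up, and the rejection mentioned in the final comment happens only in a static checker such as mypy, not at runtime.

```python
from typing import NewType

# AgeT is a distinct type to a static checker, but a plain int at runtime.
AgeT = NewType("AgeT", int)


def is_younger(a: AgeT, b: AgeT) -> bool:
    return a < b


tom_age = AgeT(31)   # explicit conversion from int
yara_age = AgeT(28)

result = is_younger(yara_age, tom_age)
# is_younger(28, 31) would run fine, but a static checker rejects it:
# a plain int is not an AgeT.
```

Because `AgeT(31)` simply returns the integer it was given, the alias costs nothing at runtime. The distinction exists purely for the type checker.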
This is how `enum` differs in TypeScript. Two structurally identical enums, even with the same keys and values, are distinct. They are nominatively typed, not structurally.

    enum CallResponse {
        no = 0,
        yes = 1,
    }

    enum OtherResponse {
        no = 0,
        yes = 1,
    }

    function Accept( r: CallResponse ) { ... }

    Accept( CallResponse.no )  // valid
    Accept( OtherResponse.no ) // invalid

Knowing this, with some trickery, you can mimic nominative types in TypeScript using enums. It’s possible to create a distinct type for `AgeT` like Python allows, though it’s not pretty.
Excess Properties
Having structural types as the basis to an entire language’s static type system has some issues. I’m going to do a followup about these problems in TypeScript, but for now, let’s look at one common case: excess properties.
We said before that structural types care only if the required properties are present, not if there are too many of them. Since this leads to a common class of errors, TypeScript disallows excess properties in places it can easily detect them. For example, think back to our `send_login_reminder`, here converted to TypeScript.

    type User = {
        name: string
        email: string
    }

    function send_login_reminder( recipient: User ) { ... }

    send_login_reminder({
        name: 'Maxine',
        email: 'max4775@mail.com',
        token: 123,
    })

This call to `send_login_reminder` is invalid, as TypeScript notices the extra `token` property and realizes it will not be used. As TypeScript, like JavaScript, lacks named arguments, passing an object whose properties stand in for named arguments is common. This extra check in TypeScript ensures you only pass arguments the function knows about.
This check is limited: it applies only to object literals, and is lost if you assign the arguments to a variable first. For example, passing a `Company` object to something expecting a `User` object would work. The excess properties check is a narrow exception to the structural typing rules. The `satisfies` operator, added in TypeScript 4.9, lets you request the same check at the point where an object literal is written, so it can be retained over several statements.
Using Structural Types
Several languages have structural typing features, and some are primarily structurally typed. I know of other languages that support structural typing, such as OCaml, Go and Haxe, but I’ve not used them enough to know specifically if they work the same way.
The key difference between structural and nominative types is whether the identity, the name, has any meaning or not. In nominative types, the name is the key defining characteristic, and objects can only match a type if they are explicitly using that name (either directly or via inheritance). Whereas in structural types, the name is irrelevant, and only the form (the properties and methods) of the objects is relevant.