The Life of a Programmer

What is Structural Typing?

Structural typing is a concept in coding where we primarily consider the shape of a type, its properties and methods, instead of the name of the type. This contrasts with the better-understood nominative typing, where the name of the type is of utmost importance.

In this article, I want to introduce the basics of structural typing, starting from a neutral viewpoint: explaining it without a particular language in mind. I will compare it to interfaces, which are not structural types. Then, I will look at Python to compare duck-typing to structural typing, and how this relates to its newer nominative type system. I’ll also look at C++, to show the inferred structural typing of templates, and how constraints are the explicit form. Finally, I’ll look briefly at TypeScript, as it’s perhaps the most common structurally typed language.

Nominative typing is the familiar kind we see in Java, C++, C#, and typed Python, among others. Most coders probably understand nominative typing. Otherwise they’re likely working in dynamically typed languages, like JavaScript or untyped Python. I’ll touch briefly on how that differs from structural typing in the “duck typing” section.

A Type

Let’s briefly consider what a type is: it’s something with a name, and a set of properties and methods associated with it. Most of us are familiar with classes, and that’s perhaps a decent generic way of looking at a type. Here’s an example of a user, in pseudo-code:

class User:
	String name
	String email

The name of this type is User, the most important part of it in a nominative type system. The structure of this type, the important part in a structural type system, is the collection of properties and methods — everything except the name. This example doesn’t include a method, for simplicity, and to exemplify a common structural typing use-case.

Nominative vs. Structural arguments

To understand the difference between nominative and structural types, we can look at the case of calling a function.

function send_login_reminder( User recipient ):
	...

tom = load_user("tom")
send_login_reminder(tom)

When the compiler, or interpreter, reaches the send_login_reminder line, it checks if the type of the argument matches the required type specified in the function declaration. Here it checks if tom matches the type User. How this matching works is significantly different in the different type systems.

In nominative typing, the argument has to be the User type. If we have inheritance, this could also be any type that inherits from User. But that’s it. Nothing else can match that type.

In structural typing, only the form of the type is checked. It isn’t relevant that it’s a User. For example, we can call this function with an anonymous object instead.

send_login_reminder({
	name = "Yara"
	email = "yara@mail.com"
})

This fails in nominative typing, since it isn’t the User type. In structural typing, the type-checker sees it looks the same as a User, so it’s an appropriate match. The object has a name and email, just like the properties of User.
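To make the two matching rules concrete, here is a minimal runtime sketch in Python. It is only an illustration under my own assumptions: the helper names matches_nominally and matches_structurally are hypothetical, and a real type checker would do this work statically rather than at runtime.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class User:
    name: str
    email: str


def matches_nominally(value: object) -> bool:
    # Nominative check: the value must be a User (or a subclass of User).
    return isinstance(value, User)


def matches_structurally(value: object) -> bool:
    # Structural check: only the shape matters, here two string attributes.
    return all(
        isinstance(getattr(value, field, None), str)
        for field in ("name", "email")
    )


tom = User(name="Tom", email="tom@mail.com")
# An "anonymous" object: it has the right attributes but no relation to User.
yara = SimpleNamespace(name="Yara", email="yara@mail.com")

print(matches_nominally(tom), matches_structurally(tom))    # True True
print(matches_nominally(yara), matches_structurally(yara))  # False True
```

The anonymous object fails the name-based check but passes the shape-based one, which is the whole difference between the two systems.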

Aliases

In this view, the names of types in structural typing are effectively only aliases. The send_login_reminder function could equally be written, in pseudo-code, like this:

function send_login_reminder( recipient as {
	String name
	String email
})

The symbol User is nowhere to be found, yet the function is semantically identical.

To dive deeper into types, take a look at my article on how compilers do type conversion.

Type Plurality

With nominative types, every object has a specific type, or set of types, that identifies it. When we create a User object, that object is actually of the type User, and nothing else. Perhaps that User inherits from a base-class, in which case it is also of that type. But that’s it. It’s not any of the other types that we define in our code-base. In this way, we can speak of the object as having a particular type, and checks like instance of make sense. Nominative types have real identities.

With structural types, objects don’t have a strict set of types that identify them. Any object matches any type, across all our code and all libraries, that has a structurally compatible form. For example, if I declare this type:

class HasName:
	String name

Any object which has a String name field, regardless of how it was created, or which types were involved in its creation, will now also match the HasName type. This means that objects don’t have a limited set of types, and can instead match many types in the code base. From this view, we can’t speak of objects as having a particular type, and checks like instance of are nonsensical — structural types don’t have identities.
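As a rough sketch of this plurality, Python’s typing.Protocol (covered in the Python section of this article) can be marked runtime_checkable so that isinstance performs a shape check. The Planet class and the HasEmail protocol are hypothetical names I added for illustration. Note that this runtime check only tests that the attributes exist, not that they are strings.

```python
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


@runtime_checkable
class HasName(Protocol):
    name: str


@runtime_checkable
class HasEmail(Protocol):
    email: str


@dataclass
class User:
    name: str
    email: str


@dataclass
class Planet:
    name: str


jane = User(name="Jane", email="jane@mail.com")
mars = Planet(name="Mars")

# Neither class mentions HasName or HasEmail, yet the objects match
# every protocol whose attributes they happen to provide.
matches = {
    "jane": (isinstance(jane, HasName), isinstance(jane, HasEmail)),
    "mars": (isinstance(mars, HasName), isinstance(mars, HasEmail)),
}
print(matches)  # {'jane': (True, True), 'mars': (True, False)}
```

Declaring one more protocol anywhere in the code base would silently add one more type that these objects match.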

Inclusive, excess properties

Structural type systems also tend not to care about excess properties. For example, perhaps we have a Company type as below.

class Company:
	String name
	String email
	Address address
	Region region

We can call the send_login_reminder function with objects of this type as well.

company = load_company("highland")
send_login_reminder(company)

Whether this makes sense in our code depends on whether or not companies can log in. If they can’t, this is a defect that a structural type system would not catch.

In contrast, if we had a send_catalog function, we probably want it to work for User and Company types, which, with no type changes, works fine in structural typing. In nominative typing, we’d have to introduce an interface, or shared base class, that exposes the common name and email.

Variables

I’ve used function argument matching as a type-matching example. The same rules apply when assigning to a variable: the value on the right side is checked against the type on the left side.

jane = load_user('jane')
User b = jane

In the second line, the type checker will check if the value to be assigned matches the explicitly declared type of the variable. Does jane match the type of User? As before, in structural typing, the name User is irrelevant, and only the form of the type is considered. So the following is allowed.

User b = {
	name = "Xiomara"
	email = "x473@mail.com"
}

The form of the value on the right matches the form on the left, thus it is allowed.

Interfaces

Interfaces, as defined in Java and C#, as well as abstract base classes in C++ and Python, are not the same as structural types. These kinds of interfaces are still nominative types: a function that takes an interface type as an argument requires that the type of that argument explicitly implements that interface.

Thinking back to before, we wanted a send_catalog function that works with both User and Company types. In structural types, we said that just works, since they have the right fields. But for nominative typing, we’d need to explicitly say the types are related. We typically do this with an interface.

interface EmailRecipient:
	String name
	String email

class User implements EmailRecipient:
	...
	
class Company implements EmailRecipient:
	...


function send_login_reminder( User recipient ) // accepts only User

function send_catalog( EmailRecipient recipient ) // accepts User and Company

With nominative types, the send_catalog function requires its argument’s type implement the EmailRecipient interface. The below, for example, would not be valid:

class LooksLikeRecipient:
	String name
	String email
	
kenny = new LooksLikeRecipient( name = 'Kenny', email = 'k273@mail.com' )
send_catalog(kenny) // fails, kenny is not an EmailRecipient

It doesn’t matter that LooksLikeRecipient has the same form as the EmailRecipient interface. That’s not what is important in a nominative type. The type has to explicitly say it implements the EmailRecipient interface, whereas in a structural type, simply having the name and email field is enough. In a structurally typed language, the call to send_catalog( kenny ) is valid.
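The contrast can be sketched in Python by defining the same interface twice: once as an abstract base class (nominative) and once as a runtime-checkable Protocol (structural). The names NominalRecipient and StructuralRecipient are hypothetical and exist only for this comparison. The runtime Protocol check tests only for the presence of the attributes.

```python
from abc import ABC
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


class NominalRecipient(ABC):
    """Nominative interface: classes must inherit from it explicitly."""
    name: str
    email: str


@runtime_checkable
class StructuralRecipient(Protocol):
    """Structural interface: any object with these attributes matches."""
    name: str
    email: str


@dataclass
class User(NominalRecipient):
    name: str
    email: str


@dataclass
class LooksLikeRecipient:
    name: str
    email: str


user = User(name="Tom", email="tom@mail.com")
kenny = LooksLikeRecipient(name="Kenny", email="k273@mail.com")

print(isinstance(user, NominalRecipient))      # True: declared explicitly
print(isinstance(kenny, NominalRecipient))     # False: same shape, no declaration
print(isinstance(kenny, StructuralRecipient))  # True: the shape is enough
```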

Python

Python was originally a dynamically typed language: there was no static type checking to see if arguments or assignments match. Python 3.5 introduced type hints, which then became the basis of Python’s static type system.

Python’s dynamic type system was referred to as duck-typing, which is a kind of implied structural type system — though with some complications. The introduced type hints, however, added a nominative type system.

Duck Typing

Duck typing is the quintessential aspect of dynamic typing. Effectively there are no static types, and whether code is valid depends solely on whether the statements executed at runtime make sense for the values involved. Despite this dynamic nature, we can still talk about the types a function takes, albeit in a less strict way.

For example, consider our familiar function and rough implementation:

def send_login_reminder( recipient ):
	message_raw = load_reminder_template()
	message = message_raw.format( recipient.name )
	send_email( recipient.email, message )

Looking at the body of this function, we can tell that it works so long as the recipient argument has both a name and email property. Despite it being dynamic, we can talk about the expected type of the recipient argument — being explicit or inferred tends not to change the fundamental way in which a type system works. For example, if we were a type checker, we would infer the following:

interface _send_login_reminder_argument_0:
	String name
	String email
	
def send_login_reminder( _send_login_reminder_argument_0 recipient ):
	...

This looks and behaves much like the User type works in a structural type system. To see how it differs, we need to introduce a conditional branch.

def send_login_reminder( recipient ):
	message_raw = load_reminder_template()
	message = message_raw.format( recipient.name )
	if recipient.email.endswith( ".special.com" ):
		send_special_email( recipient.email, recipient.token, message )
	else:
		send_email( recipient.email, recipient.priority, message )

What would the type of recipient have to look like now?

interface _send_login_reminder_argument_0:
	String name
	String email
	possibly Integer token
	possibly Integer priority

The possibly pseudo-code is not the same as an optional field, since in the code there is a path that definitely needs the token and one that definitely needs the priority, but not both. We can still express this as a structural type, but perhaps not the one you’re expecting. I’m shortening the names here for clarity.

interface _base:
	String name
	String email
	
interface _variant_0 extends _base:
	Integer token
	
interface _variant_1 extends _base:
	Integer priority
	
interface argument_0 = _variant_0 or _variant_1

In this code or means a union type, where the type can be one or the other. We can create structural types that define the inferred type of a duck-typed function. But the resulting type can become quite complex, with every branch introducing another union of similar types. They aren’t that useful as types. This is why, when migrating from duck-typing to structural typing, we’d probably opt for simple optional types for token and priority.
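The branch-dependent requirement is easy to observe at runtime. The following is a small sketch, not the real function: instead of sending email it appends to a hypothetical sent list, and the message template is inlined. An object with only a priority works until its email address selects the branch that needs a token.

```python
from types import SimpleNamespace

sent = []  # records what would have been sent, instead of real email
failure = None


def send_login_reminder(recipient):
    message = "Hello {}, please log in.".format(recipient.name)
    if recipient.email.endswith(".special.com"):
        sent.append(("special", recipient.email, recipient.token, message))
    else:
        sent.append(("normal", recipient.email, recipient.priority, message))


# Has priority but no token: fine, since only the else branch runs.
send_login_reminder(
    SimpleNamespace(name="Ana", email="ana@mail.com", priority=2)
)

# Same shape, but this email selects the branch that needs token.
try:
    send_login_reminder(
        SimpleNamespace(name="Bo", email="bo@corp.special.com", priority=2)
    )
except AttributeError as error:
    failure = str(error)

print(len(sent))  # 1
print(failure)    # the message names the missing token attribute
```

Whether an object is acceptable depends on its values, not only on its shape, which is why the inferred type needs a union rather than optional fields.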

Protocols

Python adopted a nominative type system in 3.5. You’re still allowed to use duck-typing, but you can also create statically typed functions and variables. And despite duck-typing being close to structural typing, they instead went with nominative typing. There are many reasons for this, and I’ll get to them in a future article.

Typed Python retains the ability to define structural types in the form of a Protocol, added to the typing module in Python 3.8. This is a special base class which makes a class behave like a structural type. Consider the following example, with the nominative types first.

from dataclasses import dataclass

@dataclass
class User:
	name: str
	email: str

def send_login_reminder( recipient: User ):
	...


jack = load_user('jack')  # assumed to return a User
send_login_reminder(jack)  # good

lilou = {
	'name': 'Lilou',
	'email': 'lil8493@mail.com',
}
send_login_reminder(lilou)  # rejected by a static type checker

The recipient argument of the function must take an explicit User type. Neither a dict nor another class with the same fields will work. It has to be an actual User object. But we can do something similar to the EmailRecipient used as an example before, using the Protocol class.

from typing import Protocol

class EmailRecipient(Protocol):
	name: str
	email: str

def send_login_reminder( recipient: EmailRecipient ):
	...

Now, although the User class doesn’t mention EmailRecipient at all in its declaration, it still matches, since the form is the same. Protocol creates structural types. PEP 544 calls this “Structural subtyping” or “static duck typing”. Though, as mentioned before, the structural types you define by hand will be stricter than the inferred structural types arising from duck typing.

C++

C++ has structural types hiding behind the template keyword. They were only made explicit in C++20 with the introduction of constraints and concepts.

Templates

In C++, templates allow defining a function where the types differ depending on how it is called. As a simple example, perhaps we have a function that compares the ages of two objects.

template<typename ItemT>
bool is_younger(ItemT a, ItemT b) {
    return a.age < b.age;
}

This can be called with two objects of the same type, as long as they have an age defined, and that age can be compared. The age may be an integer, or it could be a date object, anything that defines a less-than operator. Arguments a and b have inferred types; in pseudo-code we could replace ItemT with an explicit structural type.

interface LessThanT:
	bool operator < (LessThanT a, LessThanT b)
	
interface ItemT:
	LessThanT age

Now, this isn’t how C++ works. It doesn’t infer this type. Instead, it substitutes the actual types at the point where the template is instantiated, and ensures the resulting code is valid. This is similar to how Python doesn’t infer its duck-types either: it simply runs the code and sees if an error occurs.

Constraints and Concepts

The trouble with templates in C++ is that if you had a type error within template code, the error message could be long, hard to read, and hidden deep within a call-tree of templates. The same type of error happens with duck-typing in Python, where deep in some library code you get a type-mismatch.

For years, the C++ community has debated the idea of explicit structural types in the form of concepts and constraints. Unfortunately, I have not had the chance to work with C++ since they were introduced, so I’m going to defer to a reference page.

To cope with C++’s static types, the concept system is stricter and more flexible than Python’s — of course at the typical C++ expense of added syntax.

TypeScript

TypeScript is primarily a structurally typed language. Its type primitives, type and interface, both define structural types, but the enum type defines a nominative type. The generic examples I’ve given about structural types work in TypeScript, with an exception for excess properties.

Enums and fundamental types

Enums are different in TypeScript. Since they are effectively numbers or strings, and structural typing would treat all numbers, or all strings, the same, it’d make for relatively useless enum types. TypeScript broke from its structural typing to say that enum is nominatively typed. An enum of one type will not match an enum of another type, even if they share the same keys and values.

The structural nature of fundamental types is harder to explain, and the reason I’ve stuck with class types for this article. For example, take a simple “integer” type. What is the name, and what is the form of this type? Here, the term “integer” refers to both the name and its form, which can make it confusing.

Let’s consider instead an alias, using the TypeScript number type.

type AgeT = number

function is_younger(a: AgeT, b: AgeT) { ... }

AgeT is another name for number. As structural types don’t care about names, this shouldn’t be relevant. The is_younger function can be called with any numbers, and it could be declared with number instead of AgeT. The type AgeT is an interchangeable alias with number.

Some languages offer strong aliases, where, in this example, a number would not be passable to an AgeT type: you’d have to explicitly convert. Python has a natural way to do this with NewType.
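As a short sketch of a strong alias, Python’s typing.NewType creates a distinct type for static checkers while leaving the runtime value an ordinary int. The variable names here are hypothetical, and the rejected call is left commented out because the error only shows up in a static checker, not when running the code.

```python
from typing import NewType

# A strong alias: static checkers treat AgeT as distinct from int.
AgeT = NewType("AgeT", int)


def is_younger(a: AgeT, b: AgeT) -> bool:
    return a < b


alice_age = AgeT(31)  # explicit conversion from int to AgeT
bob_age = AgeT(45)
print(is_younger(alice_age, bob_age))  # True

# A static checker rejects a plain int here, although the call would
# still run, since NewType has no effect on the runtime value:
# is_younger(31, 45)
```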

This is how enum differs in TypeScript. Two structurally identical enums, even with the same keys and values, are distinct. They are nominatively typed, not structurally.

enum CallResponse {
	no = 0,
	yes = 1,
}
enum OtherResponse {
	no = 0,
	yes = 1,
}

function Accept( r: CallResponse ) { ... }

Accept( CallResponse.no )  // valid
Accept( OtherResponse.no ) // invalid

Knowing this, with some trickery, you can mimic nominative types in TypeScript using enums. It’s possible to create a distinct type for AgeT like Python allows, though it’s not pretty.

Excess Properties

Having structural types as the basis of an entire language’s static type system has some issues. I’m going to do a follow-up about these problems in TypeScript, but for now, let’s look at one common case: excess properties.

We said before that structural types care only if the required properties are present, not if there are too many of them. Since this leads to a common class of errors, TypeScript disallows excess properties in places it can easily detect them. For example, think back to our send_login_reminder, here converted to TypeScript.

type User = {
	name: string
	email: string
}

function send_login_reminder( recipient: User ) { ... }

send_login_reminder({ 
	name: 'Maxine',
	email: 'max4775@mail.com',
	token: 123,
})

This call to send_login_reminder is invalid, as TypeScript notices the extra token property and realizes it will not be used. Since TypeScript, like JavaScript, lacks named arguments, passing an object literal to emulate named arguments is common. This extra check in TypeScript ensures you only pass arguments the function knows about.

This check is limited: it applies only to object literals written directly at the point of use, and it is lost if you first assign the arguments to a variable. For example, passing a Company object to something expecting a User object would work. The excess properties check is a narrow exception to the structural typing rules. The satisfies operator was later introduced, which lets you apply the same check at the point where such a variable is defined.

Using Structural Types

Several languages have structural typing features, and some are primarily structurally typed. I know of other languages that support structural typing, such as OCaml, Go and Haxe, but I’ve not used them enough to know specifically if they work the same way.

The key difference between structural and nominative types is whether the identity, the name, has any meaning or not. In nominative types, the name is the key defining characteristic, and objects can only match a type if they are explicitly using that name (either directly or via inheritance). Whereas in structural types, the name is irrelevant, and only the form (the properties and methods) of the objects is relevant.

Please join me on Discord to discuss, or ping me on Mastodon.
