Ideal Language

Should function arguments be reassignable or mutable?

Working on a defect in Leaf I had a question: should function arguments be reassignable within a function? Are they just like local variables, or should they be treated specially? It would solve my problem in the Leaf compiler, but I don’t like making decisions for technical convenience. What is the correct answer?

This is an open question and I’d love to hear your feedback. This article lays out the details and my viewpoint, but I don’t reach a conclusion.

Imperative approach

In the languages rooted in imperative programming, like C, C++, Java, and C#, we can freely use function arguments as local variables.

int calc( int a, int b ) {
    a += b;
    b += a;
    return a;
}

I admit that I often write code that does this. It’s convenient to reuse an existing variable rather than introducing a new one. I realize it has also gotten me into trouble before, when one piece of code modifies the arguments but code later in the function isn’t expecting that.

float calc( float a, int b ) {
    float result = a;

    if (some_condition_on(a)) {
        b /= 5;
        result += b;
    }

    if (some_condition_on(b)) {
        result = alt_calc(a,b);
    }

    return result;
}

Though contrived, it shows that a second section of the code in the function may be relying on unmodified arguments. It’s a subtle defect as it requires both conditionals to evaluate to true. Add in more branches that may or may not modify the arguments, and the problem intensifies.
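One way to guard against this in C++ today is to declare the by-value parameters const. The sketch below is only an illustration: some_condition_on and alt_calc are hypothetical stand-ins, since the example above leaves them undefined, and their bodies are made up so the code compiles.

```cpp
// Hypothetical stand-ins for the helpers the article leaves undefined.
bool some_condition_on(float x) { return x > 0.0f; }
float alt_calc(float a, int b) { return a * static_cast<float>(b); }

// Top-level const on by-value parameters does not change what callers see;
// it only stops the function body from reassigning them.
float calc(const float a, const int b) {
    float result = a;

    if (some_condition_on(a)) {
        // b /= 5;                 // would not compile: b is const
        const int scaled = b / 5;  // the modified value gets its own name
        result += static_cast<float>(scaled);
    }

    if (some_condition_on(static_cast<float>(b))) {
        result = alt_calc(a, b);   // always sees the caller's original b
    }

    return result;
}
```

With this signature the second branch can no longer be affected by the first, at the cost of having to invent a name for the scaled value.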

In JavaScript, the situation is worse. In non-strict mode, if I modify a named argument, it also modifies the arguments object.

function hidden_arg( name ) {
    name = "weird"
    console.log(arguments[0])
}


hidden_arg("expected")

That writes weird, not expected.

Functional approach

If we look to a language like Haskell, we see that reassigning variables is, in general, frowned upon (is it even possible?). It’s not fundamental to functional programming, though: whether a function reassigns an argument doesn’t affect the purity of that function.

A function could, however, modify the value of an argument, and that would certainly violate the immutability requirement.

This got me thinking that perhaps the requirement should go even further: arguments should also be read-only by default. Consider the code below, where the “values” name is not reassignable (C and C++ are among the few languages where this notation is even possible):

//this prevents reassigning the "values" pointer...
float calc( vector<float> * const values ) {
    (*values)[0] = 1; //...but we can still modify the values
    ...
}

What if the default were also to make everything read-only? (This is the typical C++ syntax for doing that.)

float calc( vector<float> const & values ) {
    values[0] = 1; //error!
    ...
}

This function has a much safer signature. I can call it without worrying that my vector might be accidentally changed on me.

I guess it’s unavoidable for this discussion to get deeper into the difference between a name and a value.
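To make the name/value distinction concrete, here is a small C++ sketch of the two independent kinds of const on a pointer parameter. The function names fill_first and first_of are hypothetical and exist only for this illustration.

```cpp
#include <vector>

// Name fixed, value mutable: the pointer cannot be reseated,
// but the vector it points to can be changed.
void fill_first(std::vector<float>* const values) {
    (*values)[0] = 1.0f;
    // values = nullptr;       // would not compile: the name is const
}

// Name reassignable, value read-only: the pointer can be reseated,
// but the vector cannot be changed through it.
float first_of(const std::vector<float>* values,
               const std::vector<float>* fallback) {
    if (values->empty()) {
        values = fallback;      // fine: only the local name changes
    }
    // (*values)[0] = 1.0f;    // would not compile: the value is const
    return values->at(0);       // throws std::out_of_range if still empty
}
```

Reassigning the name in first_of is invisible to the caller, while the write in fill_first is not. That is the part of the signature a caller actually cares about.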

The default, but not a hard requirement

I’m starting to think that non-reassignable and read-only should be the default. If I want a mutable argument, I can mark it.

float sort( vector<float> mutable & values )

For complex value types that makes a lot of sense. But for service types, like say a file or window handle, it would be inconvenient. At least in Leaf, I have a distinct service type, which could be mutable by default instead. I don’t like inconsistency, but sometimes it has to be sacrificed for convenience.

Another situation that gives me pause is argument sanitization. For example:

float calc( float a, float b ) {
    if (b < 0) {
        a = -a;
        b = -b;
    }

    ...
}

In this situation, we don’t want the remainder of the function to have access to the original arguments. They’re intentionally hidden. Cleaning arguments may not be common, but I do it often enough that I’d need to have a solution for it. Perhaps hiding the arguments by a local variable of the same name might work.
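C++ does not allow a local variable in the outermost block of a function to reuse a parameter's name, so hiding by shadowing is not available there. A rough equivalent, sketched below, is to sanitize in a thin wrapper and hand the cleaned values to an inner function that never sees the originals. The name calc_clean and its body are assumptions for illustration, standing in for the elided remainder of the function.

```cpp
#include <cmath>

namespace detail {
// Only ever sees sanitized input: b is non-negative here.
// The body is a placeholder for the elided remainder of the function.
float calc_clean(const float a, const float b) {
    return a + std::sqrt(b);
}
}  // namespace detail

// The wrapper performs the clean-up and nothing else, so the original
// arguments are out of scope for the real work.
float calc(const float a, const float b) {
    if (b < 0) {
        return detail::calc_clean(-a, -b);
    }
    return detail::calc_clean(a, b);
}
```

Both parameters stay const throughout, and the sanitized values end up with the same names as the originals, just one call deeper.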

Your thoughts?

I’m undecided on what the correct solution is. Current languages and best practices don’t appear to give a definite answer yet. This makes it one of those engaging topics in language design.

It’s part of my adventure in writing Leaf. I’d be happy to hear your thoughts on the topic as well.

25 replies »

  1. What is the default behavior you are searching for?
    In the first case (reference by default) you override the point of a function: a customized modifier that interacts with a specific part of the system by reducing the scope while performing some specific action.
    In the second case (value by default) you empower the function, since you are forbidding modification of parameters that are considered input but not scope.

    Given the two cases above I would suggest you go for a function output driven approach:
    1. If the function returns a value, the input parameters should be forced into values.
    2. If the function is declared as void, inputs should be allowed to be passed by reference.

    Going further, I think you should enforce explicit declaration of the parameter regardless of its type (reference/value). This will allow you to lock the modification of input variables unless those are explicitly intended to be modified (in the function declaration).
    Optional arguments should be passed by value no matter what types they are.
    If you address the function parameters through the underlying framework (arguments[0] for example), one must always get the original value, since those should be held separately and changed once the function finishes (a flush operation). The latter will allow for easier management of the whole system, especially in asynchronous cases where one needs to be aware of time-driven value changes to ensure the correct operation of the whole application.

    Everything written above introduces some nasty problems: if you are to force an object reference into a value, you are required to make a deep copy of its content, or do some smart copy and reduce the available access points to those that are data-only. I would not recommend either of those two options. Simply drop the functionality: object/class types are to be passed by reference only, which means that they can only be accessed by void functions. The latter will lead to cleaner class interfaces, since you will force the designers to manage class references through private variables and not weird scope worm-holes that require interface twists to bridge the data between instances.

    • My goal is something safe by default, but not too inefficient. I know those can be conflicting goals at times.

      Here I was more concerned about whether the function can use the parameters as local variables. Whether things are passed by reference or by value is of course a related issue. In Leaf, unless something is marked `shared` (or is a `service` type) it’s passed by value.

      I don’t have to force things to values though. I can mark them read-only. This is of course not exactly the same, as by value implies immutable as well, whereas read-only does not.

      Varying the defaults by return type might be an option. It may not work in my language though, where a lot of types are inferred, including the return value. It’d introduce an ambiguity about what a parameter is while looking at a signature: given `( x ) -> { … }`, without seeing the whole body of the function you could no longer make any inferences about `x`. It may not be a terrible problem though, since just having or not having a return value is something that is easy to see from the code (look for a `return`).

      I’m not keen on having classes and fundamental types handled differently. I think it’s a mistake that languages like Java and C# have made. There are many situations where you need a custom type to behave like a value, and having by-reference by default makes it easy to introduce bugs when people forget to call `clone`.

      But I did mention I have a `service` type, and it does pass by reference by default (it’s automatically `shared`). I’m trying to specifically address that there are really two kinds of custom types. A `point`, `complex` or `vector4` should be indistinguishable from a fundamental type, whereas `file`, `window`, and `endpoint` behave nothing like values: they are wrappers to some global/OS resource.
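As a rough C++ analogy for the two kinds of custom types (not Leaf syntax, and not how Leaf implements them), a value type can simply be copied, while a service-like type can disable copying so that it can only be shared by reference. Point, moved_right, Counter and use are hypothetical names; Counter stands in for something like a file or window wrapper.

```cpp
// A value type: copies are independent, just like a float.
struct Point {
    float x;
    float y;
};

// Takes its own copy, so the caller's point is untouched.
Point moved_right(Point p, float dx) {
    p.x += dx;
    return p;
}

// A service-like type: it wraps a unique resource, so copying is
// disabled and it can only be passed around by reference.
class Counter {
public:
    Counter() = default;
    Counter(const Counter&) = delete;
    Counter& operator=(const Counter&) = delete;

    void bump() { ++count_; }
    int count() const { return count_; }

private:
    int count_ = 0;
};

// Every caller shares the same underlying Counter.
void use(Counter& c) { c.bump(); }
```

Forgetting a copy is impossible for Counter, because a by-value parameter of that type does not compile, and forgetting a reference for Point shows up quickly because nothing changes.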

  2. I’ll have to check the language in detail. From what you write I assume you have master types that set up the “platform” behavior and sub-types that provide concrete access points to the memory space along with allocation implementations.
    In Java and C# everything is an object. The difference comes in the way memory is allocated by the interpreters, but this is a thing you cannot avoid given the nature of the data construct. A custom type (a class) needs to scale its memory footprint in relatively unpredictable (at compile time) ways compared to an array, for example.

    • I don’t have the memory model detailed much yet in docs (possibly because it’s in flux).

      I’m not talking about dynamic-size types here, if that’s what you mean. A type like a `point : { x : float; y : float }` has a fixed predictable size. It shouldn’t vary in any way from a `float` type. Even with a collection type, I’m not sure what would be unpredictable about the memory it allocates.

      I also don’t see why I might not want a `vector` that has value semantics by default. It’s relatively easy to catch defects where you pass a value when you meant to pass a reference: things just won’t change. Whereas passing a reference when you wanted a value results in content changes that you may not immediately detect.

      In Java/C# the fundamental types are treated quite differently from class types. C# has `struct`, but Java has no way to create a proper `point` type.

  3. I opt for the safe choice by default (not reassignable, immutable, read-only) and change on demand. However, in my experience local reassignment is used only as an initialization step; for example in C#, when the parameter is null I reassign it with a sane non-null value. So maybe (just a maybe) a parameter should always be fixed (assigned for good) except in a special init section of the function. It is a pretty similar notion to readonly members in some types: you cannot reassign them, with the only exception being constructors.

    • Yes, the situation of cleaning up the input makes this a bit trickier. I’ve done this by convention before, using underscored names like `_arg0` and assigning to `arg` as the first step. But I’d prefer a clear semantic structure in the language to not obscure meaning.

      I wonder what a special init section might look like?

    • I think you should put the memory allocation in focus. The syntax can vary, but these types of questions are memory driven and not programming style driven.
      Simple types like point are predictable. Array types are also predictable, but you have two cases: a collection of generic types (like int) or of complex structures (like classes). Depending on how you want to handle the memory, you may end up either handling the base array type as a collection of pointers (render it down to a generic type and allocate in compiler-defined chunks) in the general case and then dealing with the actual data behind the pointers in separate memory spaces (going deep on the class data type and expanding on the compile/run-time space assessment), or going the other way around and keeping the allocated instance in the base collection memory space (which requires you to make the generic types super heavy in regards to compilation checks and drives the memory allocation management to the edge, mainly because you are not supporting fragmentation, which forces you to do some heavy-duty run-time optimization that will possibly end up as a performance killer).

      Depending on the way you decide to go for the implementation, you open up for problems that go way beyond the clean syntax. The way I see it a programming language is a systematized memory management/access structure(which is the full scope of the original question you asked) so you can’t really skip on the back-end(given you are not reusing an existing one) in such cases.

  4. I like Rust’s solution, that *all* variables, including arguments, are immutable by default but you can put `mut` to allow mutating them.

    The understandability advantages of things that don’t change hold everywhere, not just for parameters.

    • I’ve been thinking about something like this for a while as well. I already have a `var` declaration which kind of implies it is mutable. It’d be really easy for me to introduce a similar `let` expression that indicates the variables are immutable. No added typing for most use-cases.

      These are both just short-cuts to expanded types. So the default for parameters would also be immutable, but you could also change it to be mutable.

      I think this makes sense, thanks.

  5. I think the ideal solution is immutable/non-reassignable by default *plus* allowed (and encouraged) shadowing.

    The shadowing bit is really important for use cases like argument sanitization, and without it you often either have to invent some not-ideal names (and risk using the wrong name down the line) or make variables mutable (and risk making code less clear than it needs to be).

    Curiously, some languages, like Kotlin and Haskell, *lint* against shadowed variables, but at least in my personal experience, shadowing prevents bugs, instead of introducing them.

    • I can understand why a linter might take issue with this practice, as it often hides some subtle mistakes. Yet sanitization and variable hiding is such a nice approach.

      Maybe I could take the approach that member functions use in C#: if you intend on hiding a function you must mark it with `new`. For variables this could be a `shadow` or `hide` keyword perhaps:

      defn pine = ( x ) -> {
          shadow x = expr_on(x)

      }

      This would satisfy linting concerns without removing the functionality.

  6. >as it often hides some subtle mistakes

    My point is that not using shadowing hides more mistakes, and I firmly believe such linters are wrong :-) Obviously, I can’t prove my hypothesis, because we just don’t have the necessary data, so I can only appeal to authority: shadowing is just a special case of the more general “minimize visibility” advice :)

    The solution with explicit shadowing marker looks nice though!

  7. I prefer immutable arguments by default. (Actually, given the choice, I would probably prefer everything to be immutable by default.) Also, I like your idea of requiring shadowing to be explicitly declared.

    D has both immutable and const. Are you familiar with the difference? Both are “deep”, meaning everything you access through a const pointer or reference will also be const.

    On a slightly different topic, I like to program in a style where the getters are public and setters are private. This allows an object to see (but not make changes to) the values contained inside objects of other classes. My name for this style is “transparent programming”, and I describe it as: “all mutation happens locally”. Unfortunately, I’m not aware of any language that gives you public getters and private setters by default.

    (Aside: Regarding JavaScript’s arguments[0]: I might expect the current behavior, and it certainly would not bother me. I’ve never worked much in JavaScript, but I have written a lot of Lua code, and the two have many similarities.)
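The “public getters, private setters” style described in this comment can be written by hand in C++, though nothing in the language provides it by default. A minimal sketch, where Buffer, grow and set_size are hypothetical names chosen for the example:

```cpp
class Buffer {
public:
    // Public getter: any code may read the size.
    int size() const { return size_; }

    // Mutation is only reachable through the class's own operations.
    void grow(int extra) { set_size(size_ + extra); }

private:
    // Private setter: only members of Buffer may call this.
    void set_size(int value) { size_ = value; }

    int size_ = 0;
};
```

Outside code can call size() freely, but a direct call to set_size from outside the class does not compile, so all mutation happens locally.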

    • I’m intending on distinguishing between read-only and true const in Leaf. I know a lot of `const` things in C++ denote only read-only. I also find the `final` keyword in Java to not be very helpful: preventing only reassignment but saying nothing of the value contained within.

      I also have a `literal` type in Leaf. It’s basically C++’s const-expr stuff: a literal is something that is known at compile-time.

      I’m going to think about having private setters by default. In a way it does make sense. I have tuple types, which are completely open (and also allow names). A `class` type should almost imply the “true” data is protected.

    • mortoray wrote: “I’m going to think about having private setters by default.”

      I should clarify. I said, “Unfortunately, I’m not aware of any language that gives you public getters and private setters by default.” But it is even worse than that. I’m not aware of any language that makes it *easy* to automate the creation of public getters and private setters. I’m not sure “by default” is the right setting, but it would be nice to have the option to *automate* the creation of such getters and setters, for example on a per-class or per-module basis.

      I’m not sure if templates or mixins in D could do it. I’m not sure if Nim macros could do it. I’m not sure if Rust macros could do it. (Kotlin? Swift?)

      In C++, I declare all (or almost all) of a class’s members to be const. (I think declaring an STL container to be const can prevent the creation of some of its default constructors, so container members may be non-const.) I then have an unconst function macro that I use to remove const-ness when I want to mutate a value. The biggest downside of this approach is that it is less safe (i.e. it is more subject to programmer errors) than if the system were built into the language.

      I have not run into unexpected behavior due to removing const-ness in C++. If I ever do, there is a reasonably simple way to compensate by using some macros and compiling twice, with different macro settings. The first compilation is just to check for compile-time const errors; the second compilation will generate a properly functioning executable.

      (Aside: I believe getters and setters are also called “properties” in some languages.)

      If C++ had getters and setters (and C++ does not have them), then I might be able to automate the creation of public getters and private setters with Herb Sutter’s “Metaclass” proposal.

      https://herbsutter.com/2017/07/26/metaclasses-thoughts-on-generative-c/

      In a nutshell, Metaclasses will allow the programmer to programmatically augment and adjust class definitions at compile time. So with Metaclasses (plus properties), I could create my own metaclass called “transparent_class”, and then define a “transparent_class Foo {};”, in place of “class Foo {};” Cheers!

    • > that makes it *easy* to automate the creation of public getters and private setters.

      Well, C# has syntax for autoproperties that lets you do `public int Size { get; private set; }`

    • > > that makes it *easy* to automate the creation of public getters and private setters.

      > Well, C# has syntax for autoproperties that lets you do `public int Size { get; private set; }`

      That’s nifty. Does it work for references to other objects? Strings? (References to?) array/containers? (I’m not sure how composition works in C#, i.e. if it is like C++ or Java.)

    • > That’s nifty. Does it work for references to other objects? Strings? (References to?) array/containers? (I’m not sure how composition works in C#, i.e. if it is like C++ or Java.)

      Only insomuch as it’s visibility for the property. If you return a (always mutable) `List` from the public get, it’s mutable. Which leads to C# suggestions like “never return an array in a property”.

      Very few languages seem to have the transitive const that C++ does, and more should. Of course, only Rust seems to get it right, since if you have `void foo(int& x, int const &y)` in C++, that doesn’t mean that `y` can’t change.
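The point about `void foo(int& x, int const &y)` comes down to aliasing: `const` on a reference only restricts writes through that name. A small sketch, where the return value is an addition made only so the effect can be observed:

```cpp
// y is read-only through this name, but x may refer to the same object.
int foo(int& x, int const& y) {
    const int before = y;
    x += 1;              // if x and y alias, this also changes y
    return y - before;   // 0 when the arguments are distinct, 1 when aliased
}
```

Calling `foo(a, b)` with two different variables returns 0, while `foo(a, a)` returns 1, because the supposedly read-only `y` changed underneath the function.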

    • > Very few languages seem to have the transitive const that C++ does, and more should. Of course, only Rust seems to get it right, since if you have `void foo(int& x, int const &y)` in C++, that doesn’t mean that `y` can’t change.

      C++’s const is only partially transitive. It is transitive across composition, but not across pointer or reference.
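A short sketch of that partial transitivity, using hypothetical types `Inner` and `Outer`: `const` reaches a member held by composition, but stops at a pointer member.

```cpp
struct Inner {
    int value = 0;
};

struct Outer {
    Inner owned;     // composition: part of Outer itself
    Inner* linked;   // pointer: refers to something outside Outer
};

void touch(const Outer& o) {
    // o.owned.value = 1;   // would not compile: const reaches members
    o.linked->value = 1;    // compiles: the pointer is const, the pointee is not
}
```

Inside `touch`, `o.linked` has type `Inner* const`, so the pointer itself cannot be reseated but the object it refers to is still writable.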

      Whereas in D, const and immutable are fully transitive across composition, pointer and reference. (I follow D’s development, but have never seriously used D.)

    • I’m planning to make extrinsic modifiers, like immutability and/or read-only, be transitive modifiers in Leaf. This is orthogonal to any by-value/reference semantics. Once `const` always `const`.

  8. My opinion (which I have used in dotLang): Pick the simplest choice with minimum exceptions: Everything is immutable. So the input to a function is always immutable (which means it cannot be altered or re-assigned).

    • Yes, it works for everything, but it is limited to the actual variable encapsulated in the class where the property is defined. I don’t expose arrays that are managed internally using properties, so I am not really sure if you can assign the result of the get to a local variable and play with it, but as long as you are using the property handle you are locked by the access modifier set at design time.

      In the case of properties of classes, I would say that it is bad practice to expose references publicly since it effectively breaks encapsulation. If you need composition (regardless of whether it is a class or an array) you normally should not need to expose the composed reference. If you do need that, however, I would suggest you rethink the design rather than enforce over-regulation at compile time.

  9. Related to the previous discussion of “public getters and private setters”…

    Today I learned that Eiffel (which first appeared in 1986) allows foreign classes to read, but not assign to, local fields. If you want a foreign class to assign a value to a local field, you need to explicitly create a public setter.

    https://en.wikipedia.org/wiki/Eiffel_programming_language#Scoping

    See also this video at 21m40s:
    https://channel9.msdn.com/blogs/charles/emmanuel-stapf-eiffel-and-contract-oriented-programming
