If you open a file in UNIX and get a file descriptor back, that file descriptor is a token that designates an object - but it isn’t a capability. To be a capability, we need to open that file with some permission - some access right. For example:
intfd=open("/etc/passwd",O_RDWR);
Now we have a token and a set of rights.
In Pony, every reference has both a type and a reference capability. In fact, the reference capability is part of its type. These allow you to specify which of your objects can be shared with other actors and allow the compiler to check that what you’re doing is concurrency safe.
There are a few simple concepts you need to understand before reference capabilities will make any sense. We’ve talked about some of these already, and some may already be obvious to you, but it’s worth recapping here.
The problem with concurrency is shared mutable data. If two different threads have access to the same piece of data then they might try to update it at the same time. At best this can lead to the two threads having different versions of the data. At worst the updates can interact badly resulting in the data being overwritten with garbage. The standard way to avoid these problems is to use locks to prevent data updates from happening at the same time. This causes big performance hits and is very difficult to get right, so it causes lots of bugs.
Any data that is immutable (i.e. it cannot be changed) is safe to use concurrently. Since it is immutable it is never updated and it’s the updates that cause concurrency problems.
If a block of data has only one reference to it then we call it isolated. Since there is only one reference to it, isolated data cannot be shared by multiple threads, so there are no concurrency problems. Isolated data can be passed between multiple threads. As long as only one of them has a reference to it at a time then the data is still safe from concurrency problems.
An isolated piece of data may be a single byte. But it can also be a large data structure with multiple references between the various objects in that structure. What matters for the data to be isolated is that there is only a single reference to that structure as a whole. We talk about the isolation boundary of a data structure. For the structure to be isolated:
There must only be a single reference outside the boundary that points to an object inside.
There can be any number of references inside the boundary, but none of them must point to an object outside.
The code within a single actor is never run concurrently. This means that, within a single actor, data updates cannot cause problems. It’s only when we want to share data between actors that we have problems.
OK, safely sharing data concurrently is tricky. How do reference capabilities help?
By sharing only immutable data and exchanging only isolated data we can have safe concurrent programs without locks. The problem is that it’s very difficult to do that correctly. If you accidentally hang on to a reference to some isolated data you’ve handed over or change something you’ve shared as immutable then everything goes wrong. What you need is for the compiler to force you to live up to your promises. Pony reference capabilities allow the compiler to do just that.
If you’ve used C/C++, you may be familiar with const, which is a type qualifier that tells the compiler not to allow the programmer to mutate something.
A reference capability is a form of type qualifier and provides a lot more guarantees than const does!
In Pony, every use of a type has a reference capability. These capabilities apply to variables, rather than to the type as a whole. In other words, when you define a class Wombat, you don’t pick a reference capability for all instances of the class. Instead, Wombat variables each have their own reference capability.
As an example, in some languages, you have to define a type that represents a mutable String and another type that represents an immutable String. For example, in Java, there is a String and a StringBuilder. In Pony, you can define a single class String and have some variables that are String ref (which are mutable) and other variables that are String val (which are immutable).
There are six reference capabilities in Pony and they all have strict definitions and rules on how they can be used. We’ll get to all of that later, but for now here are their names and what you use them for:
Isolated, written iso. This is for references to isolated data structures. If you have an iso variable then you know that there are no other variables that can access that data. So you can change it however you like and give it to another actor.
Value, written val. This is for references to immutable data structures. If you have a val variable then you know that no-one can change the data. So you can read it and share it with other actors.
Reference, written ref. This is for references to mutable data structures that are not isolated, in other words, “normal” data. If you have a ref variable then you can read and write the data however you like and you can have multiple variables that can access the same data. But you can’t share it with other actors.
Box. This is for references to data that is read-only to you. That data might be immutable and shared with other actors or there may be other variables using it in your actor that can change the data. Either way, the box variable can be used to safely read the data. This may sound a little pointless, but it allows you to write code that can work for both val and ref variables, as long as it doesn’t write to the object.
Transition, written trn. This is used for data structures that you want to write to, while also holding read-only (box) variables for them. You can also convert the trn variable to a val variable later if you wish, which stops anyone from changing the data and allows it be shared with other actors.
Tag. This is for references used only for identification. You cannot read or write data using a tag variable. But you can store and compare tags to check object identity and share tag variables with other actors.
Note that if you have a variable referring to an actor then you can send messages to that actor regardless of what reference capability that variable has.
A reference capability comes at the end of a type. So, for example:
Stringiso// An isolated stringStringtrn// A transition stringStringref// A string referenceStringval// A string valueStringbox// A string boxStringtag// A string tag
What does it mean when a type doesn’t specify a reference capability? It means you are using the default reference capability for that type, which is defined along with the type. Here’s an example from the standard library:
classvalString
When we use a String we usually mean an immutable string value, so we make val the default reference capability for String (but not necessarily for String constructors, see below). For example, when we don’t specify the capability in the following code, the compiler understands that we are using the default reference capability val specified in the type definition:
leta:Stringval="Hello, world!"letb:String="I'm a wombat!"// Also a String val
So do I have to specify a reference capability when I define a type? Only if you want to. There are sensible defaults that most types will use. These are ref for classes, val for primitives (i.e. immutable references), and tag for actors.
How to create objects with different capabilities¶
When you write a constructor, by default, that constructor will either create a new object with ref or tag as the capability. In the case of actors, the constructor will always create a tag. For classes, it defaults to ref but you can create with other capabilities. Let’s take a look at an example:
classFooletx:U32newvalcreate(x':U32)=>x=x'
Now when you call Foo.create(1), you’ll get a Foo val instead of Foo ref.
But what if you want to create both val and refFoos? You could do something like this:
But, that’s probably not what you’d really want to do. Better to use the capabilities recovery facilities of Pony that we’ll cover later in the Recovering Capabilities section.