Ownership is Rust’s most unique feature, and it enables Rust to make memory safety guarantees without needing a garbage collector. Therefore, it’s important to understand how ownership works in Rust. In this chapter we’ll talk about ownership as well as several related features: borrowing, slices, and how Rust lays data out in memory.
What is Ownership
Rust’s central feature is ownership. Although the feature is straightforward to explain, it has deep implications for the rest of the language.
All programs have to manage the way they use a computer’s memory while running. Some languages have garbage collection that constantly looks for no-longer-used memory as the program runs. In other languages, the programmer must explicitly allocate and free the memory. Rust uses a third approach: memory is managed through a system of ownership with a set of rules that the compiler checks at compile time. No run-time costs are incurred for any of the ownership features.
Because ownership is a new concept for many programmers, it does take some time to get used to. The good news is that the more experienced you become with Rust and the rules of the ownership system, the more you’ll be able to naturally develop code that is safe and efficient.
Ownership Rules
First, let’s take a look at the ownership rules.
- Each value in Rust has a variable that’s called its owner.
- There can only be one owner at a time.
- When the owner goes out of scope, the value will be dropped.
Variable Scope
A scope is the range within a program for which an item is valid.
fn main() {
The variable s refers to a string literal, where the value of the string is hardcoded into the text of our program. The variable is valid from the point at which it’s declared until the end of the current scope.
{ // s is not valid here
When s comes into scope, it is valid. It remains so until it goes out of scope.
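The two points in time described above can be sketched with a nested block. This is a minimal illustration, not the original listing; the helper name scoped_length and the literal "hello" are assumptions made for the example.

```rust
// Hypothetical helper: measures a string literal that only exists
// inside an inner scope, then returns the measurement.
fn scoped_length() -> usize {
    let len;
    {                        // s is not valid here; it is not yet declared
        let s = "hello";     // s is valid from this point forward
        len = s.len();       // do something with s while it is in scope
    }                        // this scope is over, and s is no longer valid
    // println!("{}", s);    // would not compile: s is out of scope here
    len
}

fn main() {
    println!("length seen inside the scope: {}", scoped_length());
}
```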
At this point, the relationship between scopes and when variables are valid is similar to that in other programming languages. Now we’ll build on top of this understanding by introducing the String type.
The String Type
To illustrate the rules of ownership, we need a data type that is complex.
We’ll use String as the example here and concentrate on the parts of String that relate to ownership. These aspects also apply to other complex data types, whether they are provided by the standard library or created by you.
We’ve already seen string literals, where a string value is hardcoded into our program. String literals are convenient, but they aren’t suitable for every situation in which you want to use text. One reason is that they’re immutable. Another is that not every string value can be known when we write our code. For these situations, Rust has a second string type, String. This type is allocated on the heap and as such is able to store an amount of text that is unknown to us at compile time. You can create a String from a string literal using the from function, like so:
let s = String::from("hello");
The double colon (::) is an operator that allows us to namespace this particular from function under the String type rather than using some sort of name like string_from.
This kind of string can be mutated:
let mut s = String::from("Hello");
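As a small sketch of what mutating a String can look like, the example below appends a literal with push_str. The helper name build_greeting and the appended text are assumptions chosen for illustration.

```rust
// Hypothetical helper: grows a heap-allocated String and returns it.
fn build_greeting() -> String {
    let mut s = String::from("hello"); // `mut` is required to modify s
    s.push_str(", world!");            // push_str() appends a literal to the String
    s
}

fn main() {
    println!("{}", build_greeting());
}
```

The same call on a string literal would not compile, because a literal has no growable heap buffer behind it.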
So why can String be mutated but literals cannot?
Memory and Allocation
In the case of a string literal, we know the contents at compile time, so the text is hardcoded directly into the final executable, making string literals fast and efficient. But these properties come only from the literal’s immutability. Unfortunately, we can’t put a blob of memory into the binary for each piece of text whose size is unknown at compile time and whose size might change while running the program.
With the String type, in order to support a mutable, growable piece of text, we need to allocate an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
- The memory must be requested from the operating system at runtime.
- We need a way of returning this memory to the operating system when we’re done with our String.
That first part is done by us: when we call String::from, its implementation requests the memory it needs. This is pretty much universal in programming languages.
However, the second part is different. In languages with a garbage collector (GC), the GC keeps track of and cleans up memory that isn’t being used anymore, and we, as the programmer, don’t need to think about it. Without a GC, it’s the programmer’s responsibility to identify when memory is no longer being used and call code to explicitly return it, just as we did to request it. Doing this correctly has historically been a difficult programming problem. If we forget, we’ll waste memory. If we do it too early, we’ll have an invalid variable. If we do it twice, that’s a bug too. We need to pair exactly one allocate with exactly one free.
Rust takes a different path: the memory is automatically returned once the variable that owns it goes out of scope.
{
There is a natural point at which we can return the memory our String needs to the operating system: when s goes out of scope. When a variable goes out of scope, Rust calls a special function for us. This function is called drop, and it’s where the author of String can put the code to return the memory. Rust calls drop automatically at the closing }.
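To make the timing of drop visible, here is a minimal sketch using a made-up type that records when it is dropped. The names Noisy and run_scope_demo are hypothetical and exist only for this illustration; String itself does the equivalent work internally to free its heap buffer.

```rust
use std::cell::RefCell;

// Hypothetical type for illustration: it writes to a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    // Rust calls this automatically when a Noisy value goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

// Returns the order of events around an inner scope.
fn run_scope_demo() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _s = Noisy { name: "s", log: &log }; // _s is valid from here
        log.borrow_mut().push(String::from("end of inner scope"));
    } // closing brace: _s goes out of scope and drop runs here
    log.borrow_mut().push(String::from("after inner scope"));
    log.into_inner()
}

fn main() {
    for event in run_scope_demo() {
        println!("{}", event);
    }
}
```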
Note: In C++, this pattern of deallocating resources at the end of an item’s lifetime is sometimes called Resource Acquisition Is Initialization (RAII). The drop function in Rust will be familiar to you if you’ve used RAII patterns.
This pattern has a profound impact on the way Rust code is written. It may seem simple right now, but the behavior of code can be unexpected in more complicated situations when we want to have multiple variables use the data we’ve allocated on the heap.
Ways Variables and Data Interact: Move
Multiple variables can interact with the same data in different ways in Rust. Let’s look at an example using an integer.
let x = 5;
let y = x;
Here we get two independent variables, x and y. This works because integers are simple values with a known, fixed size, so two 5 values are pushed onto the stack.
Now let’s look at the String version.
let s1 = String::from("Hello");
let s2 = s1;
This looks very similar to the previous code, so we might assume that the way it works would be the same: that is, the second line would make a copy of the value in s1 and bind it to s2. But this isn’t quite what happens.
To explain this more thoroughly, let’s look at what String looks like under the covers in the figure.
A String is made up of three parts, shown on the left: a pointer to the memory that holds the contents of the string, a length, and a capacity. This group of data is stored on the stack. On the right is the memory on the heap that holds the contents.
The length is how much memory, in bytes, the contents of the String is currently using. The capacity is the total amount of memory, in bytes, that the String has received from the operating system. The difference between length and capacity matters, but not in this context, so for now, it’s fine to ignore the capacity.
When we assign s1 to s2, the String data is copied, meaning we copy the pointer, the length, and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers to. In other words, the data representation in memory looks like the following figure.
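One way to observe that only the stack data is copied is to compare the heap address before and after the assignment. This is a rough sketch; the helper name assignment_keeps_buffer is an assumption, and as_ptr, len, and capacity are the standard String methods.

```rust
// Hypothetical helper: checks that assigning a String to a new variable
// leaves the heap contents where they were.
fn assignment_keeps_buffer() -> (bool, usize, bool) {
    let s1 = String::from("hello");
    let ptr_before = s1.as_ptr();    // address of the heap contents
    let s2 = s1;                     // copies pointer, length, and capacity only
    let same_buffer = ptr_before == s2.as_ptr();
    // The capacity is at least the length; its exact value is up to the allocator.
    (same_buffer, s2.len(), s2.capacity() >= s2.len())
}

fn main() {
    let (same_buffer, len, capacity_ok) = assignment_keeps_buffer();
    println!("same heap buffer: {}", same_buffer);
    println!("length: {}, capacity >= length: {}", len, capacity_ok);
}
```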
Earlier, we said that when a variable goes out of scope, Rust automatically calls the drop function and cleans up the heap memory for that variable. But both data pointers point to the same location. This is a problem: when s2 and s1 go out of scope, they will both try to free the same memory. This is known as a double free error. Freeing memory twice can lead to memory corruption, which can potentially lead to security vulnerabilities.
To ensure memory safety, there’s one more detail to what happens in this situation in Rust. Instead of trying to copy the allocated memory, Rust considers s1 to no longer be valid and, therefore, Rust doesn’t need to free anything when s1 goes out of scope. Check out what happens when you try to use s1 after s2 is created.
let s1 = String::from("hello");
let s2 = s1;
println!("{}", s1);
You’ll get an error like this because Rust prevents you from using the invalidated reference:
error[E0382]: use of moved value: `s1`
This operation is similar to a shallow copy: it copies the pointer, length, and capacity on the stack. But because Rust also invalidates the first variable, we call it a move instead. Here, s1 was moved into s2, and s1 has been invalidated.
That solves our problem. With only s2 valid, when it goes out of scope, it alone will free the memory, and we’re done.
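A compact sketch of a move follows. The helper name move_string is hypothetical, and the commented-out line marks where the compiler would reject a use of the moved-from variable.

```rust
// Hypothetical helper: moves a String from s1 into s2 and hands s2 back.
fn move_string() -> String {
    let s1 = String::from("hello");
    let s2 = s1; // s1 is moved into s2 and is no longer valid

    // println!("{}", s1); // uncommenting this line fails with error[E0382]

    s2 // s2 is the sole owner, so the heap data is freed exactly once
}

fn main() {
    println!("{}", move_string());
}
```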
In addition, there’s a design choice that’s implied by this: Rust will never automatically create ‘deep’ copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in terms of runtime performance.
Ways Variables and Data Interact: Clone
If we do want to deeply copy the heap data of the String, not just the stack data, we can use a common method called clone.
let s1 = String::from("Hello");
let s2 = s1.clone();
This works just fine: the heap data does get copied.
When you see a call to clone, you know that some arbitrary code is being executed and that code may be expensive. It’s a visual indicator that something different is going on.
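As an illustration of the deep copy, the sketch below clones a String and checks that the two values hold equal text in separate heap buffers. The helper name clone_string is an assumption.

```rust
// Hypothetical helper: clones a String and reports whether the clone
// lives in its own heap buffer.
fn clone_string() -> (String, String, bool) {
    let s1 = String::from("hello");
    let s2 = s1.clone(); // copies the heap data, not just the stack data
    let distinct_buffers = s1.as_ptr() != s2.as_ptr();
    (s1, s2, distinct_buffers) // both variables are still valid here
}

fn main() {
    let (s1, s2, distinct_buffers) = clone_string();
    println!("s1 = {}, s2 = {}", s1, s2);
    println!("separate heap buffers: {}", distinct_buffers);
}
```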
Stack-Only Data: Copy
There’s another wrinkle we haven’t talked about yet. This code, which uses integers, works and is valid:
let x = 5;
let y = x;
This code runs correctly because types like integers that have a known size at compile time are stored entirely on the stack, so copies of the actual values are quick to make. That means there’s no reason we would want to prevent x from being valid after we create the variable y. In other words, there’s no difference between deep and shallow copying here, so calling clone wouldn’t do anything different from the usual shallow copying, and we can leave it out.
Rust has a special annotation called the Copy trait that we can place on types like integers that are stored on the stack.
If a type has the Copy trait, an older variable is still usable after the assignment. Rust won’t let us annotate a type with the Copy trait if the type, or any of its parts, has implemented the Drop trait. If the type needs something special to happen when the value goes out of scope and we add the Copy annotation to that type, we’ll get a compile-time error.
As a general rule, any group of simple scalar values can be Copy, and nothing that requires allocation or is some form of resource is Copy. Here are some of the types that are Copy:
- All the integer types.
- The boolean type.
- All the floating point types.
- Tuples, but only if they contain types that are also Copy.
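The list above can be exercised with a short sketch. The Point struct and the copy_examples helper are hypothetical names introduced for this example; the derive attribute is the usual way to opt a type into Copy when all of its fields are Copy.

```rust
// Hypothetical type for illustration: every field is Copy, so the struct can be too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Returns values that were all still usable after being assigned elsewhere.
fn copy_examples() -> (i32, i32, Point, Point) {
    let x = 5;
    let y = x;                // integers are Copy: x stays valid

    let p1 = Point { x: 1, y: 2 };
    let p2 = p1;              // Point derives Copy: p1 stays valid

    let t1 = (1, 2.5, true);  // a tuple of Copy types is itself Copy
    let _t2 = t1;
    let _first = t1.0;        // t1 is still usable after the assignment

    (x, y, p1, p2)
}

fn main() {
    println!("{:?}", copy_examples());
}
```

Adding a String field to Point would make the derive fail to compile, since String owns a heap allocation and is not Copy.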
Ownership and Functions
The semantics for passing a value to a function are similar to assigning a value to a variable. Passing a variable to a function will move or copy, just like assignment.
fn main() {
If we tried to use s after the call to take_ownership, Rust would throw a compile-time error. These static checks protect us from mistakes.
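Here is one possible sketch of passing values to functions. The name take_ownership comes from the text; its body, the makes_copy function, and the return values are assumptions added so the behavior can be observed.

```rust
// Takes ownership of the String passed in; the body is an assumption
// made for illustration.
fn take_ownership(some_string: String) -> usize {
    some_string.len()
} // some_string goes out of scope here and its memory is freed

// Hypothetical companion: receives a copy of the integer.
fn makes_copy(some_integer: i32) -> i32 {
    some_integer + 1
}

fn main() {
    let s = String::from("hello");
    let len = take_ownership(s); // s moves into the function
    // println!("{}", s);        // compile-time error: s was moved

    let x = 5;
    let y = makes_copy(x);       // i32 is Copy, so x is still usable
    println!("len = {}, x = {}, y = {}", len, x, y);
}
```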
Return Values and Scope
Returning values can also transfer ownership.
fn main() {
It’s possible to return multiple values using a tuple.
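The following sketch shows ownership moving out of functions through return values, including a tuple that hands the String back together with a computed length. All three function names are assumptions chosen for the example.

```rust
// Hypothetical: creates a String and moves it out to the caller.
fn gives_ownership() -> String {
    String::from("hello")
}

// Hypothetical: takes a String and returns it, moving it back to the caller.
fn takes_and_gives_back(a_string: String) -> String {
    a_string
}

// Hypothetical: returns the String together with its length in a tuple,
// so the caller gets ownership back along with the result.
fn length_with_ownership(s: String) -> (String, usize) {
    let length = s.len();
    (s, length)
}

fn main() {
    let s1 = gives_ownership();
    let s2 = takes_and_gives_back(s1); // s1 is moved in, the result is moved into s2
    let (s3, len) = length_with_ownership(s2);
    println!("The length of '{}' is {}.", s3, len);
}
```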
References and Borrowing
Here is how you would define and use a calculate_length function that has a reference to an object as a parameter instead of taking ownership of the value.
fn main() {
First, notice that all the tuple code in the variable declaration and the function return value is gone. Second, note that we pass &s1 into calculate_length, and in its definition, we take &String rather than String.
These ampersands (&) are references, and they allow you to refer to some value without taking ownership of it.
In this figure, the &String s points to the String s1.
Let’s take a closer look at the function call here:
let s1 = String::from("Hello");
let len = calculate_length(&s1);
The &s1 syntax lets us create a reference that refers to the value of s1 but does not own it. Because it does not own it, the value it points to will not be dropped when the reference goes out of scope.
Likewise, the signature of the function uses & to indicate that the type of the parameter s is a reference.
fn calculate_length(s: &String) -> usize { // s is a reference to a String
The scope in which the variable s is valid is the same as any function parameter’s scope, but we don’t drop what the reference points to when it goes out of scope because we don’t have ownership.
Functions that have references as parameters instead of the actual values mean we won’t need to return the values in order to give back ownership, since we never had ownership.
We call having references as function parameters borrowing. As in real life, if a person owns something, you can borrow it from them. When you’re done, you have to give it back.
If we try to modify something we’re borrowing, it won’t work.
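A minimal sketch of borrowing with an immutable reference is shown below. The calculate_length signature follows the text, while its body and the commented-out change function are assumptions used to show where the compiler would object.

```rust
fn calculate_length(s: &String) -> usize { // s is a reference to a String
    s.len()
} // s goes out of scope here, but nothing is dropped: s does not own the String

// Hypothetical attempt to modify borrowed data; it does not compile:
// fn change(some_string: &String) {
//     some_string.push_str(", world"); // error: cannot borrow as mutable
// }

fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1); // lend s1 without giving up ownership
    println!("The length of '{}' is {}.", s1, len); // s1 is still valid
}
```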
Mutable References
fn main() {
First, we had to change s to be mut. Then we had to create a mutable reference with &mut s and accept a mutable reference with some_string: &mut String.
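The three changes just described can be sketched as follows. The parameter name some_string comes from the text; the function name change and the appended text are assumptions.

```rust
// Accepts a mutable reference, so it may modify the borrowed String.
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}

fn main() {
    let mut s = String::from("hello"); // s itself must be declared mut
    change(&mut s);                    // create a mutable reference with &mut s
    println!("{}", s);
}
```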
But mutable references have one big restriction: you can have only one mutable reference to a particular piece of data in a particular scope.
This restriction allows for mutation but in a very controlled fashion. It’s something that new Rustaceans struggle with, because most languages let you mutate whenever you’d like. The benefit of having this restriction is that Rust can prevent data races at compile time.
A data race is a particular type of race condition in which these three behaviors occur:
- Two or more pointers access the same data at the same time.
- At least one of the pointers is being used to write to the data.
- There’s no mechanism being used to synchronize access to the data.
Data races cause undefined behavior and can be difficult to diagnose and fix when you’re trying to track them down at runtime. Rust prevents this problem from happening because it won’t even compile code with data races.
As always, we can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones:
let mut s = String::from("Hello");
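A sketch of the scoped approach: each mutable reference lives in its own stretch of code, so the two never exist at the same time. The helper name scoped_mutable_borrows and the appended text are assumptions.

```rust
// Hypothetical helper: uses two mutable references one after the other.
fn scoped_mutable_borrows() -> String {
    let mut s = String::from("hello");
    {
        let r1 = &mut s;
        r1.push_str(", world");
    } // r1 goes out of scope here, so a new mutable reference is allowed

    let r2 = &mut s;
    r2.push('!');
    s
}

fn main() {
    println!("{}", scoped_mutable_borrows());
}
```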
A similar rule exists for combining mutable and immutable references. This code results in an error:
let mut s = String::from("Hello");
We also cannot have a mutable reference while we have an immutable one.
Users of an immutable reference don’t expect the values to suddenly change out from under them.
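The sketch below shows an arrangement that is accepted and, in a comment, one that is rejected. It assumes a compiler with non-lexical lifetimes (Rust 2018 and later), where a borrow ends at the reference's last use; under the older scope-based rule described in the text, the immutable borrows would last until the end of the enclosing block. The helper name read_then_write is hypothetical.

```rust
// Hypothetical helper: reads through immutable references first,
// then writes through a mutable one.
fn read_then_write() -> (usize, String) {
    let mut s = String::from("hello");

    let r1 = &s; // no problem
    let r2 = &s; // no problem: any number of immutable references is fine
    // let r3 = &mut s; // rejected here, because r1 and r2 are used below
    let total = r1.len() + r2.len(); // last use of r1 and r2

    let r3 = &mut s; // accepted: the immutable borrows are no longer in use
    r3.push_str(", world");
    (total, s)
}

fn main() {
    let (total, s) = read_then_write();
    println!("total = {}, s = {}", total, s);
}
```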
Dangling References
In languages with pointers, it’s easy to erroneously create a dangling pointer, a pointer that references a location in memory that may have been given to someone else, by freeing some memory while preserving a pointer to that memory. In Rust, by contrast, the compiler guarantees that references will never be dangling references: if we have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.
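To illustrate, here is a hypothetical function that would produce a dangling reference, left commented out because the compiler rejects it, next to a version that returns the String itself. Both function names are assumptions.

```rust
// This version does not compile: it tries to return a reference to a
// value that is dropped when the function ends.
// fn dangle() -> &String {
//     let s = String::from("hello");
//     &s // s goes out of scope here, so the reference would dangle
// }

// Returning the String directly moves ownership out, so nothing is freed early.
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}

fn main() {
    println!("{}", no_dangle());
}
```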
The Rules of References
- At any given time, you can have either one mutable reference or any number of immutable references, but not both.
- References must always be valid.
Slices
Another data type that doesn’t have ownership is the slice. Slices let you reference a contiguous sequence of elements in a collection rather than the whole collection.
String Slice
A string slice is a reference to part of a String, and looks like this:
let s = String::from("Hello world");
let hello = &s[0..5];
let world = &s[6..11];
This is similar to taking a reference to the whole String, but with the extra [0..5] bit. Rather than a reference to the entire String, it’s a reference to an internal position in the String and the number of elements that it refers to.
We create slices with a range of [start_index..end_index], where start_index is the first position included and end_index is one more than the last position included, but the slice data structure actually stores the starting position and the length of the slice. So in the case of let world = &s[6..11];, world would be a slice that contains a pointer to the byte at index 6 of s and a length value of 5.
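The stored position and length can be checked directly. In this sketch the helper name split_hello_world is an assumption; the offset is computed by comparing the slice's pointer with the String's pointer.

```rust
// Hypothetical helper: slices a String and reports where the second
// slice starts and how long it is.
fn split_hello_world() -> (String, String, usize, usize) {
    let s = String::from("hello world");
    let hello = &s[0..5];
    let world = &s[6..11];

    // Distance in bytes from the start of s to the start of world.
    let offset = world.as_ptr() as usize - s.as_ptr() as usize;
    (hello.to_string(), world.to_string(), offset, world.len())
}

fn main() {
    let (hello, world, offset, len) = split_hello_world();
    println!("{} / {} (offset {}, length {})", hello, world, offset, len);
}
```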
With Rust’s .. range syntax, if you want to start at the first index (0), you can drop the value before the two periods. In other words, &s[0..n] and &s[..n] are equal.
By the same token, if your slice includes the last byte of the String, you can drop the trailing number.
You can also drop both values to take a slice of the entire string.
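The three shorthand forms can be compared against their fully written equivalents. The helper name range_forms_agree and the indices used are assumptions for illustration.

```rust
// Hypothetical helper: checks that each shorthand range selects the same
// text as its explicit form.
fn range_forms_agree() -> bool {
    let s = String::from("hello");
    let len = s.len();

    let from_start = &s[0..2] == &s[..2]; // leading 0 omitted
    let to_end = &s[3..len] == &s[3..];   // trailing length omitted
    let whole = &s[0..len] == &s[..];     // both omitted: the entire string

    from_start && to_end && whole
}

fn main() {
    println!("all range forms agree: {}", range_forms_agree());
}
```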
fn first_word(s: &String) -> &str {
String Literals Are Slices
let s = "Hello, World";
The type of s here is &str: it’s a slice pointing to that specific point of the binary. This is also why string literals are immutable; &str is an immutable reference.
String Slices as Parameters
Knowing that you can take slices of literals and Strings leads us to one more improvement on first_word, and that’s its signature:
fn first_word(s: &str) -> &str {
If we have a string slice, we can pass that directly. If we have a String, we can pass a slice of the entire String. Defining a function to take a string slice instead of a reference to a String makes our API more general and useful without losing any functionality.
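Only the signature of first_word appears in the text, so the body below is an assumption: a simple sketch that scans the bytes for the first space and returns everything before it. It shows the same function accepting both a slice of a String and a string literal.

```rust
// Sketch of first_word taking a string slice; the body is an assumed
// implementation, not the original listing.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[..i]; // everything before the first space
        }
    }

    s // no space found: the whole input is one word
}

fn main() {
    let my_string = String::from("hello world");
    let word = first_word(&my_string[..]); // a slice of the entire String
    println!("{}", word);

    let my_string_literal = "hello world";
    let word = first_word(my_string_literal); // a literal is already a &str
    println!("{}", word);
}
```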
Other Slices
String slices, as you might imagine, are specific to strings. But there’s a more general slice type, too.
let a = [1, 2, 3, 4, 5];
This slice has the type &[i32]. It works the same way as string slices do, by storing a reference to the first element and the length.
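As a brief sketch of a non-string slice, the example below borrows part of an array. The helper name middle_of_array and the range 1..3 are assumptions chosen for illustration.

```rust
// Hypothetical helper: borrows part of an array as a slice and copies
// the selected elements out so they can be returned.
fn middle_of_array() -> Vec<i32> {
    let a = [1, 2, 3, 4, 5];
    let slice: &[i32] = &a[1..3]; // elements at indices 1 and 2
    slice.to_vec()
}

fn main() {
    println!("{:?}", middle_of_array());
}
```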