A collection of reminders and tips from the excellent Rust Book.
Rust is a statically typed language and must know the types of all variables at compile time.
A scalar type represents a single value (integer, floating-point number, Boolean, character) whereas compound types group multiple values into one type: tuples (whose elements may have different types) and arrays (whose elements all share one type).
Rust is an expression-based language.
Statements are instructions that perform some action and do not return a value.
let x = 8; //statement
Expressions evaluate to a resulting value.
{
    let x = 4; // statement
    x * 2 // expression
}
Expressions can be part of statements. Calling a function, or a macro, is an Expression.
Expressions do not end with a semicolon (;). If you add one, the expression becomes a statement and no longer returns a value.
Function signatures must declare the type of every parameter.
Function bodies contain a series of statements and optionally end in an expression.
Functions can return a value. We don't name the return value in the signature, but we have to declare its type after an arrow (->).
Functions return the final expression in the block. You can return early by using the return keyword.
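As a minimal sketch of both ways of returning (the function abs_diff is a hypothetical name chosen for illustration, not something from the Rust Book):

```rust
// Hypothetical helper: absolute difference of two integers.
fn abs_diff(a: i32, b: i32) -> i32 {
    if a == b {
        return 0; // early return with the `return` keyword
    }
    // Final expression of the body: no semicolon, so its value is returned.
    if a > b { a - b } else { b - a }
}

fn main() {
    println!("{}", abs_diff(3, 10));
}
```

Adding a semicolon after the last if/else would turn it into a statement, and the function would no longer compile because it would return () instead of an i32.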
The if keyword must be followed by a condition evaluating to a bool. Rust will not automatically try to convert non-Boolean types to a Boolean.
With multiple else if arms, only the first arm whose condition is true executes; all later arms are skipped.
Because if is an expression, it can be used on the right side of a let statement to give a variable a value according to a condition (in which case every arm must evaluate to the same type).
let number = if condition { 5 } else { 6 };
A loop will run indefinitely until you explicitly tell it to stop using break.
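A small sketch of loop and break, assuming a hypothetical helper named first_double_at_least and a start value greater than zero (otherwise the loop would never end). It also shows that break can hand a value back out of the loop:

```rust
// Hypothetical helper: doubles `start` until it reaches at least `limit`.
// Assumes `start` is greater than zero.
fn first_double_at_least(start: u32, limit: u32) -> u32 {
    let mut n = start;
    loop {
        if n >= limit {
            break n; // stops the loop; the loop expression evaluates to n
        }
        n *= 2;
    }
}

fn main() {
    println!("{}", first_double_at_least(1, 10));
}
```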
for loops are concise and safe, especially when combined with methods like .iter() or with a range:
let a = [1, 2, 3, 4];
// both loops visit the values 1 through 4 (a.iter() yields references to them)
for b in a.iter() {}
for b in 1..=4 {}
Ownership is a unique property of Rust.
The scope is the range within a program for which an item is valid. Usually an item is valid from the point where it is declared inside a { until the matching }, where it goes out of scope.
When an item goes out of scope, Rust automatically calls the drop function and returns the memory it occupied to the memory allocator.
By repeating the use of the let keyword you can shadow a variable by re-using the same name.
In this case the value of the variable is the last one declared in the scope and the others are shadowed.
Shadowing also allows you to change the type of a variable.
Shadowing is different from marking a variable as mut: without the let keyword, re-assigning a value to the variable is a compile-time error.
let x = 5;
let x = x * 2; // the first value of x is used on the right here
let foo = "bar";
let foo = foo.len(); // foo equals 3 now
If we assign one variable to another, the value is moved into the second variable and the first is no longer valid afterwards.
let s1 = String::from("Hello"); let s2 = s1; //s1 is no longer valid
This only applies to types that do not implement the Copy trait, such as String, whose data lives on the heap. Integers and other simple types whose size is known at compile time are stored entirely on the stack; they are copied rather than moved, so the original variable is still valid.
For data on the heap, this ensures there will be no double free error, which would otherwise happen when both variables try to free the same memory.
Rust never automatically makes deep copies of data (i.e. copies of the data on the heap), so any automatic copying can be considered inexpensive in terms of runtime performance. A deep copy has to be requested explicitly with .clone(). Chapter 4.1
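To contrast with the move shown earlier, here is a sketch of an explicit deep copy with clone and of a stack-only integer copy. The function name clone_then_use_both is made up for this illustration:

```rust
// Hypothetical helper: clones a String so that both values stay usable.
fn clone_then_use_both() -> (String, String) {
    let s1 = String::from("Hello");
    let s2 = s1.clone(); // explicit deep copy: the heap data is duplicated
    (s1, s2) // s1 is still valid because it was cloned, not moved
}

fn main() {
    let (a, b) = clone_then_use_both();
    println!("{} {}", a, b);

    let x = 5;
    let y = x; // i32 implements Copy, so x is copied and stays valid
    println!("{} {}", x, y);
}
```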
The ampersand & allows you to make references to data instead of moving it.
A reference does not own the data, so the data is not dropped when the reference goes out of scope.
Like variables, references are immutable by default but can be made mutable.
A mutable reference &mut can only be created from a mutable variable. You can have only one mutable reference to a given value at a time.
You also cannot create a mutable reference while immutable references to the same value are still in use.
The scope of a reference ends when it is last used, which can be long before the closing curly bracket }.
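The borrowing rules above can be sketched as follows. The helpers length and shout are hypothetical names used only for this example:

```rust
// Hypothetical helper: borrows immutably, so the caller keeps ownership.
fn length(s: &str) -> usize {
    s.len()
}

// Hypothetical helper: borrows mutably in order to modify the caller's String.
fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s; // several immutable references are allowed at once
    println!("{} {}", r1, r2); // last use of r1 and r2: their scope ends here

    let r3 = &mut s; // allowed, because r1 and r2 are no longer used
    shout(r3);

    println!("{} has length {}", s, length(&s));
}
```

Moving the first println! below the creation of r3 would make the compiler reject the program, because the immutable references would then still be in use.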
str is the string slice type. A string literal, e.g. let word = "Hello";, has the type &str: it is immutable and stored in the program's binary, not on the heap.
String is a string stored on the heap that can be mutable and can be instantiated using the from function, e.g. let word = String::from("Hello");
&str is a string slice, that is, a reference to part of a string. A slice is created using range notation, e.g. &s[0..4]. Chapter 4.3
String literals are string slices because they point to that particular part of the binary. String slices store a reference to the first element and a length. You can take a full slice by omitting the start and end points: &s[..]
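As an illustration of string slices, here is a sketch of a function that returns the first word of a string. It is written in the spirit of the first_word example from Chapter 4.3, though this exact body is an assumption rather than the Book's code:

```rust
// Returns a slice of `s` up to (not including) the first space,
// or the whole string if it contains no space.
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(i) => &s[..i], // a space is one byte, so i is a valid boundary
        None => s,
    }
}

fn main() {
    let sentence = String::from("hello world");
    println!("{}", first_word(&sentence)); // a &String coerces to &str
    println!("{}", first_word("literal")); // string literals are already &str
}
```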
Slices also work on other collections; for example, a slice of an array of i32 values has the type &[i32]:
let a = [1, 2, 3, 4, 5]; let s = &a[1..3]; //s is [2, 3]
Structs are flexible objects to tie together pieces of data that are significant to each other.
A struct is made up of named fields. Rust does not allow marking only certain fields as mutable: the entire instance must be mutable.
Structs are declared using the struct keyword along with the name of the struct and its fields. Instances of the struct can then be created.
Fields can be accessed using dot notation: struct_instance.field_name.
If a build function returns an instance of a struct, and the names of its parameters are the same as the names of the struct's fields, we can use the field init shorthand to avoid repetition. Doc: Chapter 5.1.
The struct update syntax ..another_struct allows us to conveniently instantiate a struct by taking the remaining fields from another instance.
Tuple structs can be used to conveniently name a tuple without naming each field as we would in a struct. Doc: Chapter 5.1.
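The struct features above can be sketched together. The User and Color types and the build_user function are illustrative names, loosely modelled on Chapter 5.1:

```rust
struct User {
    username: String,
    email: String,
    active: bool,
}

// Tuple struct: the tuple gets a name, its fields do not.
struct Color(u8, u8, u8);

// Field init shorthand: parameters share the names of the fields.
fn build_user(username: String, email: String) -> User {
    User { username, email, active: true }
}

fn main() {
    let user1 = build_user(String::from("alice"), String::from("alice@example.com"));

    // Struct update syntax: fields not listed are taken from user1.
    let user2 = User {
        email: String::from("bob@example.com"),
        ..user1
    };

    let black = Color(0, 0, 0);
    println!("{} {} {}", user2.username, user2.email, user2.active);
    println!("{} {} {}", black.0, black.1, black.2);
}
```

Note that the update syntax moves user1.username into user2, so user1 can no longer be used as a whole afterwards.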
You can add functionality, such as the std::fmt::Debug trait, to a struct by adding the #[derive(Debug)] annotation above the struct declaration. This allows the use of {:?} or {:#?} (for pretty-print) in a println! macro to display the struct for debugging purposes. Chapter 5.2
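A short sketch of deriving Debug, using a hypothetical Rectangle struct:

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

fn main() {
    let rect = Rectangle { width: 30, height: 50 };
    println!("{:?}", rect); // single-line debug output
    println!("{:#?}", rect); // pretty-printed, one field per line
}
```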
You can tie operations on a struct to the struct itself by defining methods inside an impl (implementation) block. Doc: Chapter 5.3.
A method's first parameter is always self and methods are called using the . syntax. Methods can take ownership of self, borrow &self immutably, or borrow &mut self mutably, just as they can any other parameter.
Associated functions can also be created in the impl block and don't have to take self as a first parameter. Associated functions are different from methods and are called using the :: syntax.
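A sketch of methods and an associated function on a hypothetical Rectangle struct (the names square, area and widen are chosen for illustration):

```rust
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Associated function: no self parameter, called as Rectangle::square(3).
    fn square(size: u32) -> Rectangle {
        Rectangle { width: size, height: size }
    }

    // Method borrowing self immutably.
    fn area(&self) -> u32 {
        self.width * self.height
    }

    // Method borrowing self mutably.
    fn widen(&mut self, extra: u32) {
        self.width += extra;
    }
}

fn main() {
    let mut r = Rectangle::square(3);
    println!("{}", r.area());
    r.widen(1);
    println!("{}", r.area());
}
```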
Enums define a type by enumerating its possible variants.
An enum can be one and only one of its variants.
Refer to Doc: Chapter 6.1 for the creation of enums.
You can put any kind of data inside an enum variant: strings, numeric types, structs, etc., and the different variants are not required to hold the same type of data.
Just like structs, enums can have methods defined in impl blocks, and these methods are called using the . notation.
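For instance, a sketch of an enum whose variants carry different kinds of data, with a method defined on it. The Message enum is loosely based on the one in Chapter 6.1, and describe is a hypothetical method:

```rust
enum Message {
    Quit,                    // no data
    Move { x: i32, y: i32 }, // named fields, like a struct
    Write(String),           // a single String
}

impl Message {
    // Hypothetical method: builds a short description of the message.
    fn describe(&self) -> String {
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write {}", text),
        }
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hi")),
    ];
    for m in messages.iter() {
        println!("{}", m.describe());
    }
}
```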
A very common enum is the Option enum defined by the standard library, with its variants Some and None:
enum Option<T> { Some(T), None, }
The Option enum exists because Rust does not have a null value as many other programming languages do. Instead, if a value can be either present or absent, we use an Option<T>, which can hold a value of type T or hold nothing. This forces developers to avoid the common programming mistake of not handling the case where a value could be nothing. To use a value that may be nothing, a Rustacean has to opt in explicitly by using the Option enum.
In general, we use an Option<T> when we want to handle all the cases. The match expression is a control flow construct that does just this when used with enums: it will run different code depending on which variant of the enum it has, and that code can use the data inside the matching value.
Combining match and enums is useful in many situations: match against an enum, bind a variable to the data inside, and then execute code based on it.
Because match is exhaustive, all possibilities must be described. The catchall _ can be used to catch all remaining possibilities that have not explicitly been handled and assign them to one common expression.
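A sketch of match on an Option and of the _ catchall. plus_one follows the well-known example from Chapter 6.2, while describe is a hypothetical helper:

```rust
// Every variant of Option must be handled, otherwise this does not compile.
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1), // i is bound to the data inside Some
    }
}

// Hypothetical helper: _ catches every value not listed explicitly.
fn describe(n: u8) -> &'static str {
    match n {
        1 => "one",
        2 => "two",
        _ => "many",
    }
}

fn main() {
    println!("{:?} {:?}", plus_one(Some(5)), plus_one(None));
    println!("{} {}", describe(1), describe(42));
}
```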
The if let syntax is more reader-friendly when defining code that only executes for one variant of an enum. An else can be added to an if let to define the action for all the other variants if necessary.
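A minimal sketch of if let with an else branch, using a hypothetical function describe_max:

```rust
// Hypothetical helper: only the Some variant is of interest here.
fn describe_max(config_max: Option<u8>) -> String {
    if let Some(max) = config_max {
        format!("The maximum is {}", max)
    } else {
        // Runs for every other variant (here only None).
        String::from("No maximum configured")
    }
}

fn main() {
    println!("{}", describe_max(Some(3)));
    println!("{}", describe_max(None));
}
```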
Vectors allow you to store multiple values in a single data structure and have them next to each other in memory (on the heap). A vector can grow or shrink in size.
Vectors can only store values of the same type.
Like any variable, when the vector goes out of scope, all its contents are dropped.
let v: Vec<i32> = Vec::new(); // a type annotation is needed because no values are inserted
// or
let v = vec![1, 2, 3]; // using the vec! macro and letting the compiler infer the type
let mut v: Vec<i32> = Vec::new();
v.push(5);
v.push(88);
// the compiler can infer the i32 type from the values pushed into the vector, so the annotation is optional here
let third: &i32 = &v[2]; // reference to the third element, counting starts from 0; panics if the index is out of bounds
// or
match v.get(2) {
    Some(third) => { /* code */ }
    None => { /* code */ }
}
String is a growable, mutable, owned, UTF-8 encoded string type. &str refers to a borrowed string slice.
let data = "initial content";
let s = data.to_string();
// or
let s = "initial content".to_string();
// or
let s = String::from("initial content");
To concatenate two existing strings you can use the + operator or the format! macro.
Check out Doc: Chapter 8.2 for more in depth explanations of what happens and how to use it.
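A sketch of both ways of concatenating. The helper names join_with_plus and join_with_format are assumptions made for this example:

```rust
// The + operator takes ownership of the left String and borrows the right side.
fn join_with_plus(a: String, b: &str) -> String {
    a + b
}

// format! only borrows its arguments and builds a new String.
fn join_with_format(a: &str, b: &str) -> String {
    format!("{}-{}", a, b)
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");
    let s3 = join_with_plus(s1, &s2); // s1 has been moved and is no longer usable
    println!("{}", s3);
    println!("{}", join_with_format("tic", "tac"));
}
```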
Note that Rust does not allow indexing of a string, e.g. the following will not compile :
let s = String::from("hello"); let h = s[0]; //Error !
The reason for this is the internal representation of a String, in which a UTF-8 character can be more than 1 byte long. Doc: Chapter 8.2
Slicing with an explicit range such as &s[0..2] is still possible, with caution: the program will panic if the range splits the string at an invalid char boundary.
The correct method is to iterate over the string's char values using the .chars() method:
for c in "hello world".chars() { println!("{}", c); }
The .bytes() method iterates in the same way but returns each raw byte instead; choose whichever is better for your code.
The type HashMap<K, V> stores on the heap a mapping of keys of type K to values of type V (K and V are generic types). All the keys must have the same type, and so must all the values.
HashMaps are useful when you want to look up data using a key of any type rather than an index, as you would with a vector.
HashMap is not brought into scope automatically and must be imported with
use std::collections::HashMap;
A hash map can be created using the new associated function, or by calling the collect method on an iterator of key-value tuples.
Check out Doc: Chapter 8.3 for more in depth explanations.
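As a sketch of the collect approach, assuming two vectors of equal length and a hypothetical helper named build_scores:

```rust
use std::collections::HashMap;

// Hypothetical helper: pairs each team with its score and collects
// the resulting tuples into a HashMap.
fn build_scores(teams: Vec<String>, scores: Vec<i32>) -> HashMap<String, i32> {
    teams.into_iter().zip(scores).collect()
}

fn main() {
    let teams = vec![String::from("Blue"), String::from("Yellow")];
    let scores = build_scores(teams, vec![10, 50]);
    println!("{:?}", scores.get("Blue"));
}
```

The return type annotation is what tells collect which collection to build.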
We can get a value out of the hashmap by providing its key to the get method:
use std::collections::HashMap;

let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);

let team_name = String::from("Blue");
let score = scores.get(&team_name);
Here score will have the value Some(&10).
There are several options when updating the value already associated with a key: should the new value overwrite the previous one, or should it be ignored when a value already exists? Check this section for more details.
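The update strategies can be sketched with insert and the entry API of HashMap. The function word_count is modelled on the word-counting example in Chapter 8.3:

```rust
use std::collections::HashMap;

// Updates a value based on the old one: counts occurrences of each word.
fn word_count(text: &str) -> HashMap<String, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // or_insert returns a mutable reference to the existing or new value.
        *counts.entry(word.to_string()).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Blue"), 25); // overwrites the previous value
    scores.entry(String::from("Blue")).or_insert(50); // ignored: key already exists
    scores.entry(String::from("Yellow")).or_insert(50); // inserted: key was missing
    println!("{:?}", scores);
    println!("{:?}", word_count("hello world wonderful world"));
}
```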
Inside a package, each file placed in the /src/bin directory will be a separate binary crate.
/src/main.rs is the crate root of the binary crate that has the same name as the package.
A crate groups related functionality together in a scope so the functionality is easy to share between multiple projects.
A package can contain at most one library crate, whose crate root is /src/lib.rs.
Modules let us organize code within a crate into groups for readability and reuse, as well as providing control over the privacy of items, e.g. whether an item is public (usable by outside code) or private (not available for outside code).
Calling a function inside a module can be done by using either absolute (starts from crate root) or relative (starts from current module) paths.
The keyword super can be used to begin a relative path at the parent module instead of the current module. Doc: Chapter 7.3
Code inside child modules is private by default but can be made public using the pub keyword. The pub keyword should be set on any function, struct, enum or method that needs to be made publicly available to code in other modules. Doc: Chapter 7.3
Note that for structs marked with pub, each field must also be explicitly marked with pub if it needs to be made available; otherwise it remains private. In contrast, if an enum is made pub, all of its variants are public.
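A sketch of modules, pub and super in a single file. The garden module and everything inside it are hypothetical names used for illustration:

```rust
mod garden {
    pub struct Plant {
        pub name: String, // public field
        water_level: u32, // private field: not accessible outside `garden`
    }

    impl Plant {
        pub fn new(name: &str) -> Plant {
            Plant { name: name.to_string(), water_level: 0 }
        }

        pub fn water_level(&self) -> u32 {
            self.water_level
        }
    }

    pub mod tools {
        // `super` starts the path at the parent module (`garden`).
        // A child module may access private items of its ancestors.
        pub fn water(plant: &mut super::Plant) {
            plant.water_level += 1;
        }
    }
}

fn main() {
    let mut fern = garden::Plant::new("fern");
    garden::tools::water(&mut fern);
    // fern.water_level would not compile here: the field is private.
    println!("{} {}", fern.name, fern.water_level());
}
```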
The use keyword can be used with absolute or relative paths to bring modules into the current scope and use shorter paths afterwards. Doc: Chapter 7.4
The as keyword can be used to bring a type into scope under a different name than the one it was defined with in its module.
pub can also be used before the use keyword to re-export the name brought into scope, making it publicly available.
To reduce the number of use lines we can use nested paths:
use std::io::{self, Write};
and/or the glob operator *:
use std::collections::*;
Modules can be put in separate files in the /src/ directory of the project. They are then declared in the /src/main.rs or /src/lib.rs file using the same name as the file:
mod name_of_module;
Switching from an .expect() call to a match expression is how you generally move from crashing on an error to handling the error.
The panic! macro can be called explicitly, and is also invoked when the program encounters an unrecoverable error. By default the program then unwinds: it walks back up the stack and cleans up the data before exiting.
Alternatively, you can explicitly opt into aborting on panic (panic = 'abort' in a [profile] section of Cargo.toml), which makes for smaller binaries and leaves it up to the OS to clean up the memory after a crash.
Finally, a backtrace can be displayed (by setting the RUST_BACKTRACE environment variable) to show the calls leading to the exact point where the program panicked.
The Result enum has two variants: Ok and Err. We can use a match expression to go through these two variants and define behavior accordingly:
use std::fs::File;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => panic!("Problem opening the file: {:?}", error),
    };
}
You can specify more code to take different actions depending on different failure reasons. Doc: Chapter 9.2
Using match works well but gets very verbose when you deal with multiple nested match constructs. The .unwrap() and .expect() methods give a shorter way to write code similar to what match would produce. Check here.
Errors that occur inside a function can be propagated to the code calling that function. The ? operator is syntactic sugar for propagating errors back to the calling function. Doc: Chapter 9.2
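A sketch of error propagation with the ? operator. It uses string parsing instead of file access so that it stays self-contained, and sum_str is a hypothetical function:

```rust
use std::num::ParseIntError;

// If either parse fails, ? returns the Err to the caller immediately;
// otherwise it unwraps the Ok value and execution continues.
fn sum_str(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?;
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    match sum_str("2", "3") {
        Ok(total) => println!("total: {}", total),
        Err(e) => println!("could not parse: {}", e),
    }
    match sum_str("2", "three") {
        Ok(total) => println!("total: {}", total),
        Err(e) => println!("could not parse: {}", e),
    }
}
```

The ? operator can only be used in a function whose return type is compatible with the value being propagated, here a Result with a matching error type.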
Returning a Result is a good default choice when defining a function that might fail: the calling code can choose to attempt to recover in a way appropriate for the situation, or decide that an Err value is unrecoverable and call panic! to crash the program.
Read Chapter 9.3 for more on when to panic! or not.
Generics are useful for removing duplicate code and writing functions or structs that can operate on abstract types instead of concrete types like i32, String or char.
Generics often use one-letter names like T and are defined as follows (function and struct):
fn the_function<T>(values: &[T]) -> &T {
    // do something in the function with a slice of values of type T,
    // then return a reference to a value of type T
    &values[0] // e.g. the first element (panics on an empty slice)
}

struct TheStruct<T> {
    field1: T,
    field2: T,
} // this struct has two fields of type T
Generics can be used in methods as well, or concrete types can be used to define methods that only apply if the generic is of that particular type and won't be available to any other types.
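A sketch of generic and concrete impl blocks, following the Point<T> example from Chapter 10.1:

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available for every Point<T>, whatever T is.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Only available when T is f32.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let p_int = Point { x: 1, y: 2 };
    let p_float = Point { x: 3.0_f32, y: 4.0_f32 };
    println!("{} {}", p_int.x(), p_int.y);
    println!("{}", p_float.distance_from_origin());
    // p_int.distance_from_origin() would not compile: T is i32 there.
}
```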
Generics in Rust are very efficient at run-time because the compiler replaces all calls using generics by definitions using concrete types in a process named monomorphization. More info
A trait tells the Rust compiler about functionality a particular type has and can share with other types.
Chapter 10.2 details how to create a trait like this one :
pub trait Summary { fn summarize(&self) -> String; }
Each type implementing the Summary trait must define its own custom behavior for the body of the summarize method. The compiler will enforce this. Note that instead of ending the method signature with a ; semicolon, a default behavior can be provided in the trait declaration within {} curly brackets.
To implement a trait on a type :
impl Summary for SomeStruct {
    fn summarize(&self) -> String {
        // some code
        String::new() // placeholder return value
    }
}
If the trait had a default behavior, this implementation will override it.
Default implementations can also call other methods in the same trait, even if those other methods don't have a default implementation.
Traits can be used to define functions that accept many different types that all implement a certain trait. The function is defined using that trait as the type of a parameter, instead of a concrete type. Check here for more info.
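A sketch combining a default implementation that calls a required method with a function taking a trait as a parameter. This Summary trait is a variation on the one shown earlier (it adds a hypothetical author method), and Tweet and notify are illustrative names in the style of Chapter 10.2:

```rust
trait Summary {
    // Required: each implementing type must provide this method.
    fn author(&self) -> String;

    // Default implementation, which calls the required method.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.author())
    }
}

struct Tweet {
    username: String,
}

impl Summary for Tweet {
    fn author(&self) -> String {
        format!("@{}", self.username)
    }
    // summarize is not overridden, so the default is used.
}

// Accepts a reference to any type that implements Summary.
fn notify(item: &impl Summary) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet { username: String::from("rustlang") };
    println!("{}", notify(&tweet));
}
```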
Blanket implementations are used extensively in the Rust standard library: they implement a trait for any type that implements another trait. More info
You can create custom types and implement methods on them so that the compiler helps ensure some conditions are met. Doc: Chapter 9.3
Constants cannot be marked mut and are declared using the const and not the let keyword.
for word in text.split_whitespace() {}
cargo doc --open
cargo run --example name_of_example
Discussion thread started in 2017 on GUI in Rust Official Rust Discourse forum