Rust

A collection of reminders and tips from the excellent Rust Book.

Rust is a statically typed language and must know the types of all variables at compile time.

A scalar type represents a single value (integer, floating-point number, Boolean, character), whereas a compound type groups multiple values into one type: a tuple can mix values of different types, while an array holds values of a single type.

Doc: Chapter 3.2

Rust is an expression-based language.

Statements are instructions that perform some action and do not return a value.

let x = 8; //statement

Expressions evaluate to a resulting value.

{
    let x = 4; //statement
    x * 2 //expression
}

Expressions can be part of statements. Calling a function, or a macro, is an Expression.

Expressions do not end with a semicolon. If you add one, the expression becomes a statement and no longer returns a value.

Function signatures must declare the type of every parameter.

Function bodies contain a series of statements and optionally end in an expression.

Functions can return a value. We don't name the return value in the signature, but we have to declare its type after an arrow (->).

Functions return the final expression in the block. You can return early by using the return keyword.
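A minimal sketch of the function rules above. The names double and sign are made up for illustration:

```rust
// The return type is declared after `->`; the return value itself is not named.
fn double(x: i32) -> i32 {
    x * 2 // no semicolon: this expression is the return value
}

// `return` exits the function before the final expression is reached.
fn sign(x: i32) -> i32 {
    if x < 0 {
        return -1; // early return
    }
    if x == 0 { 0 } else { 1 } // final expression
}

fn main() {
    // A block is an expression too: it evaluates to its last expression.
    let y = {
        let x = 4;
        x * 2
    };
    println!("{} {} {}", y, double(8), sign(-3));
}
```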

if must be followed by a condition evaluating to a bool. Rust will not automatically try to convert non-Boolean types to a Boolean.

Using multiple else if will execute only the first arm for which the condition is true and skip all others after.

Because if is an expression, it can be used on the right side of a let statement to assign a value that depends on a condition (in which case, all arms must evaluate to the same type).

let number = if condition { 5 } else { 6 };

loop runs a block indefinitely until you explicitly stop it with break.
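As a small sketch (the function halvings is hypothetical), loop is also an expression: a value placed after break becomes the result of the loop.

```rust
// Counts how many times `n` can be halved before it reaches 1 (or 0).
fn halvings(mut n: u32) -> u32 {
    let mut count = 0;
    // The value given to `break` becomes the value of the whole `loop`.
    loop {
        if n <= 1 {
            break count;
        }
        n /= 2;
        count += 1;
    }
}

fn main() {
    println!("{}", halvings(8));
}
```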

for loops are concise and safe, especially when combined with methods like .iter() or with a range:

let a = [1, 2, 3, 4];

//both loops visit the values 1 to 4:
for b in a.iter() {} //b is a reference (&i32) to each element
for b in 1..5 {} //b is an i32; the end of the range is excluded

Ownership is a unique property of Rust.

  • Each value in Rust has a variable called its owner.
  • There can only be one owner at a time.
  • When the owner goes out of scope, the value is dropped.

The scope is the range within a program for which an item is valid. Usually an item is valid from the point where it is declared inside a { until the matching } where it goes out of scope.

When an item goes out of scope, Rust automatically calls the drop function and returns the memory it occupied to the memory allocator.

By repeating the use of the let keyword you can shadow a variable by re-using the same name.

In this case the value of the variable is the last one declared in the scope and the others are shadowed.

Shadowing also allows you to change the type of a variable.

Shadowing is different from marking a variable as mut: without the let keyword, re-assigning the value is a compile-time error, and after the shadowing let the new variable is still immutable.

let x = 5;
let x = x * 2; //the first value of x is used on the right here
let foo = "bar";
let foo = foo.len(); //foo equals 3 now

If we assign one variable to another, the value is moved from the first variable into the second, and the first variable is no longer valid afterwards.

let s1 = String::from("Hello");
let s2 = s1; 
//s1 is no longer valid

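This is only true for types that own data on the heap, such as String. Integers and other simple types whose size is known at compile time are stored entirely on the stack and implement the Copy trait: they are copied instead of moved, so the original variable is still valid after the assignment.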

For data on the heap, this ensures there will be no double free error when both variables try to free the same memory.

Rust never automatically makes deep copies of data (i.e. copies of the whole data on the heap), so any automatic copying can be considered inexpensive in terms of runtime performance. A deep copy has to be requested explicitly with the .clone() method. Chapter 4.1
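The difference between a move, an explicit clone and a Copy type can be sketched as follows (original_and_copy is a made-up helper):

```rust
// Returns the moved string together with an explicit deep copy of it.
fn original_and_copy() -> (String, String) {
    let s1 = String::from("Hello");
    let s2 = s1; // s1 is moved into s2 and can no longer be used
    let s3 = s2.clone(); // explicit deep copy: s2 stays valid
    (s2, s3)
}

fn main() {
    let (a, b) = original_and_copy();
    println!("{} {}", a, b);

    // Integers implement Copy: both variables stay valid after the assignment.
    let x = 5;
    let y = x;
    println!("{} {}", x, y);
}
```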

The ampersand & allows you to make references to data instead of moving it.

A reference does not own the data, so the data is not dropped when the reference goes out of scope.

Like variables, references are immutable by default but can be made mutable.

A mutable reference &mut can only point to a mutable variable. You can only have one mutable reference to a given value at a time.

You also cannot have a mutable reference to a value while immutable references to it are still in use.

The scope of a reference ends when it is last used, which can be long before the closing curly bracket }.
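A short sketch of these borrowing rules, using two hypothetical helpers, length and add_world:

```rust
// Borrows the string immutably: the caller keeps ownership.
fn length(s: &String) -> usize {
    s.len()
}

// Borrows the string mutably: the variable passed in must be declared `mut`.
fn add_world(s: &mut String) {
    s.push_str(", world");
}

fn main() {
    let mut s = String::from("hello");

    let r1 = &s;
    let r2 = &s; // any number of immutable references is fine
    println!("{} {}", r1, r2); // last use of r1 and r2: their scope ends here

    add_world(&mut s); // allowed: the immutable references are no longer in use
    println!("{} ({} bytes)", s, length(&s));
}
```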

Chapter 4.2

str is the string slice type. A string literal is an immutable &str whose text is stored in the program's binary, e.g. let word = "Hello";

String is a string stored on the heap that can be mutable and can be created using the from function, e.g. let word = String::from("Hello");

&str is a string slice, that is, a reference to part of a string. A slice is written using range notation, e.g. &s[0..4]. Chapter 4.3

String literals are string slices because they point to that particular part of the binary. String slices store a reference to the first element and a length. You can take a full slice by omitting a start and end point : &s[..]

Slices are also valid on other collections. For example, a slice of an array of i32 has the type &[i32]:

let a = [1, 2, 3, 4, 5];
let s = &a[1..3]; //s is [2, 3]
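For string slices, a typical use is returning part of a string without copying it. This sketch follows the first_word example from Chapter 4.3:

```rust
// Returns the first word of `s` as a slice borrowed from `s`.
fn first_word(s: &str) -> &str {
    for (i, byte) in s.bytes().enumerate() {
        if byte == b' ' {
            return &s[..i]; // slice from the start up to the first space
        }
    }
    &s[..] // no space found: the whole string is one word
}

fn main() {
    let sentence = String::from("hello world");
    // A &String can be passed where a &str is expected.
    println!("{}", first_word(&sentence));
}
```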

Structs are flexible objects to tie together pieces of data that are significant to each other.

Creating

A struct is made up of fields which are named. Fields can only be marked as mutable if the entire struct is mutable.

Structs are declared using the struct keyword along with the name of the struct and its fields. Instances of the struct can then be created.

Fields can be accessed using dot notation : struct_instance.field_name.

If a build function returns an instance of a struct, and the parameters of the build function have the same names as the struct's fields, we can use the field init shorthand (writing field_name instead of field_name: field_name) to avoid repetition. Doc: Chapter 5.1.

The struct update syntax ..another_struct lets us conveniently create an instance that takes its remaining fields from another instance.

Tuple structs can be used to conveniently name a tuple without naming each field as we would in a struct. Doc: Chapter 5.1.
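The following sketch puts these pieces together. User, Color and build_user are illustrative names modelled on Chapter 5.1:

```rust
struct User {
    username: String,
    email: String,
    active: bool,
}

// Tuple struct: the type has a name, its fields do not.
struct Color(u8, u8, u8);

// Field init shorthand: the parameter names match the field names.
fn build_user(username: String, email: String) -> User {
    User {
        username,
        email,
        active: true,
    }
}

fn main() {
    let user1 = build_user(String::from("alice"), String::from("alice@example.com"));

    // Struct update syntax: the fields not listed are taken from user1.
    let user2 = User {
        email: String::from("bob@example.com"),
        ..user1
    };
    println!("{} {} {}", user2.username, user2.email, user2.active);

    let orange = Color(255, 165, 0);
    println!("{} {} {}", orange.0, orange.1, orange.2);
}
```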

Displaying

You can add functionality, such as std::fmt::Debug, to a struct by adding the #[derive(Debug)] annotation above the struct declaration. This allows the use of {:?} or {:#?} (for pretty-print) in a println! macro to display the struct for debugging purposes. Chapter 5.2

Methods

You can tie operations on a struct to the struct itself by defining methods inside an impl (implementation) block. Doc : Chapter 5.3.

A method's first parameter is always self and methods are called using the . syntax. Methods can take ownership of self, borrow &self immutably, or borrow &mut self mutably, just as they can any other parameter.

Associated functions

Associated functions can also be created in the impl block and don't have to take self as a first parameter. Associated functions are different from methods and are called using the :: syntax.
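A sketch showing a derived Debug, two methods and an associated function on a Rectangle struct like the one in Chapter 5 (widen is a made-up method):

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Associated function: no `self`, called as Rectangle::square(3).
    fn square(size: u32) -> Rectangle {
        Rectangle { width: size, height: size }
    }

    // Method borrowing `self` immutably.
    fn area(&self) -> u32 {
        self.width * self.height
    }

    // Method borrowing `self` mutably.
    fn widen(&mut self, extra: u32) {
        self.width += extra;
    }
}

fn main() {
    let mut r = Rectangle::square(3);
    r.widen(2);
    println!("{:?} has area {}", r, r.area());
}
```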

Enums define a type by enumerating its possible variants.

An enum can be one and only one of its variants.

Creation

Refer to Doc: Chapter 6.1 for the creation of enums.

You can put any kind of data inside an enum variant: strings, numeric types, structs, etc. The different variants do not need to hold the same type of data.

Methods

Just like structs, enums can have methods associated by using impl blocks and the methods can be called using the . notation.
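As an illustration, here is an enum whose variants hold different kinds of data, with a method defined in an impl block. The Message enum is loosely based on Chapter 6.1 and describe is a made-up method:

```rust
enum Message {
    Quit,                    // no data
    Move { x: i32, y: i32 }, // named fields, like a struct
    Write(String),           // a single String
}

impl Message {
    fn describe(&self) -> String {
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write \"{}\"", text),
        }
    }
}

fn main() {
    let messages = [
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hi")),
    ];
    for m in messages.iter() {
        println!("{}", m.describe());
    }
}
```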

'Option' enum

A very common enum is the Option enum, defined by the standard library with its variants Some and None:

enum Option<T> {
    Some(T),
    None,
}

The Option enum exists because Rust does not have the null value found in many other programming languages. Instead, if a value can either be present or absent, we use an Option<T> that can hold a value of type T or nothing. This forces developers to handle the case where a value could be nothing, avoiding a common programming mistake. To work with a value that might be nothing, a Rustacean has to opt in explicitly by using an Option enum.

'match' workflow

In general, we use an Option<T> when we want to handle all the cases. The match expression is a control flow construct that does just this when used with enums: it will run different code depending on which variant of the enum it has, and that code can use the data inside the matching value.

Combining match and enum is useful in many situations: match against an enum, bind a variable to the data inside, and then execute code based on it.

Because match is exhaustive, all possibilities must be described. The catchall _ can be used to catch all remaining possibilities that have not explicitly been handled and assign them to one common expression.

Doc: Chapter 6.2
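A short sketch of match on an Option<T> and of the _ catchall. plus_one follows the example in Chapter 6.2 and describe is a made-up function:

```rust
// Adds one to the value inside Some, and passes None through.
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        Some(i) => Some(i + 1), // `i` is bound to the data inside the variant
        None => None,
    }
}

// `_` catches every value not handled by an earlier arm.
fn describe(n: u8) -> &'static str {
    match n {
        1 => "one",
        2 => "two",
        _ => "something else",
    }
}

fn main() {
    println!("{:?} {:?}", plus_one(Some(5)), plus_one(None));
    println!("{} {}", describe(1), describe(200));
}
```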

'if let' workflow

The if let syntax is more reader-friendly when defining code that only executes on one variant of an enum. else can be used with an if let to define an action for all the other variants if necessary.

Doc: Chapter 6.3
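A minimal if let sketch, with a hypothetical describe function:

```rust
fn describe(value: Option<u8>) -> String {
    // Only the Some variant is of interest; everything else goes to `else`.
    if let Some(n) = value {
        format!("got {}", n)
    } else {
        String::from("got nothing")
    }
}

fn main() {
    println!("{} / {}", describe(Some(3)), describe(None));
}
```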

Doc: Chapter 8.1

Vectors allow you to store multiple values in a single data structure and have them next to each other in memory (on the heap). A vector can grow or shrink in size.

Vectors can only store values of the same type.

Like any variable, when a vector goes out of scope, all its contents are dropped.

Creating

let v: Vec<i32> = Vec::new(); //type annotation needed because no value is inserted yet
//or
let v = vec![1, 2, 3]; //using vec! macro and letting compiler infer type.

Adding values

let mut v: Vec<i32> = Vec::new();
v.push(5);
v.push(88);
//The compiler can infer the i32 type from the values pushed in the vector so the Vec<i32> annotation is optional

Reading Elements

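let v = vec![1, 2, 3, 4, 5];
let third: &i32 = &v[2]; //reference to third element, count starts from 0; panics if the index is out of bounds
//or
match v.get(2) { //get returns an Option<&i32> instead of panicking
    Some(third) => println!("The third element is {}", third),
    None => println!("There is no third element."),
}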

Doc: Chapter 8.2

String is a growable, mutable, owned, UTF-8 encoded string type. &str refers to a borrowed string slice.

Creation

let data = "initial content";
let s = data.to_string();
//or
let s = "initial content".to_string();
//or
let s = String::from("initial content");

Concatenation

To concatenate two existing strings you can use the + operator or the format! macro.

Check out Doc: Chapter 8.2 for more in depth explanations of what happens and how to use it.
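A sketch of both approaches, wrapped in two made-up helper functions to make the ownership visible in the signatures:

```rust
// `+` takes ownership of the left operand and borrows the right one.
fn join_with_plus(a: String, b: &str) -> String {
    a + b
}

// `format!` only borrows its arguments.
fn join_with_format(a: &str, b: &str) -> String {
    format!("{}{}", a, b)
}

fn main() {
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");

    let s3 = join_with_format(&s1, &s2); // s1 and s2 are still usable
    let s4 = join_with_plus(s1, &s2); // s1 has been moved and can no longer be used
    println!("{} / {}", s3, s4);
}
```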

Indexing

Note that Rust does not allow indexing of a string, e.g. the following will not compile :

let s = String::from("hello");
let h = s[0]; //Error !

The reason for this is the internal representation of a String where a UTF-8 character could be more than 1 byte long. Doc: Chapter 8.2

Slicing can still be used with caution using an explicit byte range such as &s[0..2], but the program will panic at runtime if the range splits the string at an invalid char boundary.

Iterating

The correct method is to iterate over the string's char values using the .chars() method :

for c in "hello world".chars() {
    println!("{}", c);
}

The .bytes() method works the same way but returns each raw byte instead. Choose whichever suits your code better.
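The difference shows up as soon as a string contains a character that needs more than one byte. A small sketch with two hypothetical helpers:

```rust
fn char_count(s: &str) -> usize {
    s.chars().count()
}

fn byte_count(s: &str) -> usize {
    s.bytes().count()
}

fn main() {
    // \u{e9} is an e with an acute accent, which takes two bytes in UTF-8.
    let word = "h\u{e9}llo";
    println!("{} chars, {} bytes", char_count(word), byte_count(word));
    // &word[0..2] would panic: byte 2 is in the middle of the accented character.
}
```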

The type HashMap<K, V> stores on the heap a mapping of keys of type K to values of type V. (Types K and V are generics). All the keys have to have the same type, and so do all the values.

HashMaps are useful when you want to look up data using keys of any type instead of an index, as you would with a vector.

HashMap is not brought into scope automatically and must be imported with

use std::collections::HashMap;

Creating

A hashmap can be created with the HashMap::new function, or by calling the collect method on an iterator of key-value tuples.

Check out Doc: Chapter 8.3 for more in depth explanations.
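Both ways of creating a hashmap can be sketched like this (with_new and with_collect are made-up names):

```rust
use std::collections::HashMap;

// Builds a map by inserting entries one at a time.
fn with_new() -> HashMap<String, i32> {
    let mut scores = HashMap::new();
    scores.insert(String::from("Blue"), 10);
    scores.insert(String::from("Yellow"), 50);
    scores
}

// Builds the same map by collecting an iterator of (key, value) tuples.
fn with_collect() -> HashMap<String, i32> {
    let teams = vec![String::from("Blue"), String::from("Yellow")];
    let initial_scores = vec![10, 50];
    teams.into_iter().zip(initial_scores.into_iter()).collect()
}

fn main() {
    println!("{}", with_new() == with_collect());
}
```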

Accessing Values

We can get a value out of the hashmap by providing its key to the get method :

use std::collections::HashMap;
 
let mut scores = HashMap::new();
 
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
 
let team_name = String::from("Blue");
let score = scores.get(&team_name);

Here score will have the value Some(&10)

Updating values

There are multiple operations that can be done when updating the value already associated to a key : should the new value overwrite the previous or should it be ignored ? Check this section for more details.
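The three usual cases (overwrite, insert only if the key is absent, update based on the old value) can be sketched as follows. word_count is a hypothetical helper:

```rust
use std::collections::HashMap;

// Counts how often each word occurs in `text`.
fn word_count(text: &str) -> HashMap<&str, u32> {
    let mut counts = HashMap::new();
    for word in text.split_whitespace() {
        // Insert 0 if the word is new, then update the value in place.
        *counts.entry(word).or_insert(0) += 1;
    }
    counts
}

fn main() {
    let mut scores = HashMap::new();
    scores.insert("Blue", 10);
    scores.insert("Blue", 25); // overwrites the previous value
    scores.entry("Blue").or_insert(50); // ignored: Blue already has a value
    scores.entry("Yellow").or_insert(50); // inserted: Yellow had no value
    println!("{:?}", scores);
    println!("{:?}", word_count("hello world wonderful world"));
}
```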

  • Packages: A Cargo feature that lets you build, test, and share crates
  • Crates: A tree of modules that produces a library or executable
  • Modules and use: Let you control the organization, scope, and privacy of paths
  • Paths: A way of naming an item, such as a struct, function, or module

Inside a package, each file placed in /src/bin/ will be a separate binary crate.

By convention, /src/main.rs is the crate root of a binary crate with the same name as the package.

A crate groups related functionality together in a scope so the functionality is easy to share between multiple projects.

A package can have at most one library crate, whose crate root is /src/lib.rs, and any number of binary crates.

Modules let us organize code within a crate into groups for readability and reuse, as well as providing control over the privacy of items, e.g. whether an item is public (usable by outside code) or private (not available for outside code).

Paths

Calling a function inside a module can be done by using either absolute (starts from crate root) or relative (starts from current module) paths.

The keyword super can be used to start a relative path at the parent module instead of the current module. Doc: Chapter 7.3

Making code publicly available

Code inside child modules is private by default but can be made public using the pub keyword. The pub keyword should be set on any module, function, struct, enum or method that needs to be made publicly available to code in other modules. Doc: Chapter 7.3

Note that for structs marked with pub, each field must also be explicitly marked with pub to be made available; otherwise it remains private. In contrast, if an enum is made pub, all of its variants are public.
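A single-file sketch of these privacy and path rules, loosely following the restaurant example of Chapter 7.3. The functions serve and describe are made up for illustration:

```rust
mod front_of_house {
    pub mod hosting {
        pub fn add_to_waitlist() -> &'static str {
            "added to waitlist"
        }
    }
}

mod back_of_house {
    pub struct Breakfast {
        pub toast: String,      // public field
        seasonal_fruit: String, // private field
    }

    impl Breakfast {
        pub fn summer(toast: &str) -> Breakfast {
            Breakfast {
                toast: String::from(toast),
                seasonal_fruit: String::from("peaches"),
            }
        }

        pub fn describe(&self) -> String {
            // `super` starts the path at the parent module.
            format!("{} toast with {}, {}", self.toast, self.seasonal_fruit, super::serve())
        }
    }
}

fn serve() -> &'static str {
    "served"
}

fn main() {
    // Relative path, starting from the current module. When this code sits at the
    // crate root, the absolute path is crate::front_of_house::hosting::add_to_waitlist.
    println!("{}", front_of_house::hosting::add_to_waitlist());

    let mut meal = back_of_house::Breakfast::summer("Rye");
    meal.toast = String::from("Wheat"); // allowed: `toast` is public
    // meal.seasonal_fruit = String::from("plums"); // would not compile: private field
    println!("{}", meal.describe());
}
```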

The 'use' keyword

The use keyword can be used with absolute or relative paths to bring modules into the current scope and use lighter path calls afterwards. Doc: Chapter 7.4

The as keyword can be used to bring a type into scope under a different name than the one it was defined with in its module.

pub can also be used before the use keyword to re-export the name brought into scope, making it publicly available.

To reduce the volume of many lines of use code we can use nested paths :

use std::io::{self, Write};

and/or the glob operator *:

use std::collections::*;

Putting modules in separate files

Doc: Chapter 7.5

Modules can be put in separate files in the /src/ directory of the project. They are then declared in the /src/main.rs or /src/lib.rs file using the same name as the file:

mod name_of_module;

Switching from an .expect() call to a match expression is how you generally move from crashing on an error to handling the error.

Unrecoverable errors : panic!

The panic! macro is called, by your code or by the standard library, when the program encounters an unrecoverable error. By default, the program prints a failure message and unwinds, walking back up the stack and cleaning up the data, then quits.

Alternatively, you can explicitly opt into aborting on panic by setting panic = 'abort' in a [profile] section of Cargo.toml, which makes for a smaller binary and leaves it up to the OS to clean up the memory the program was using.

Finally, a backtrace, enabled by setting the RUST_BACKTRACE environment variable, can be used to display extended information leading to the exact point where the program panicked.

Recoverable errors : Result

The Result enum has two variants : Ok and Err. We can use a match expression to go through these two variants and define behavior accordingly :

use std::fs::File;
 
fn main() {
    let f = File::open("hello.txt");
 
    let f = match f {
        Ok(file) => file,
        Err(error) => panic!("Problem opening the file: {:?}", error),
    };
}

You can specify more code to take different actions depending on different failure reasons. Doc: Chapter 9.2
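As a sketch of reacting to different failure reasons, the kind of an I/O error can be matched in a nested match. describe_open is a hypothetical function that only reports what happened:

```rust
use std::fs::File;
use std::io::ErrorKind;

// Tries to open `path` and reports the outcome as text.
fn describe_open(path: &str) -> String {
    match File::open(path) {
        Ok(_) => String::from("opened"),
        Err(error) => match error.kind() {
            ErrorKind::NotFound => String::from("file not found"),
            other => format!("other error: {:?}", other),
        },
    }
}

fn main() {
    println!("{}", describe_open("hello.txt"));
}
```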

Using match works well but gets very verbose when you deal with multiple nested match constructs. The .unwrap() and .expect() methods are shortcuts: they return the value inside Ok and call panic! on Err, with .expect() letting you choose the panic message. Check here.

Errors that occur inside a function can be propagated to the code calling that function. The ? operator is syntactic sugar for this: on Ok it yields the inner value, on Err it returns the error from the current function. Doc: Chapter 9.2

Returning a Result is a good default choice when defining a function that might fail: the calling code can choose to attempt to recover in a way appropriate for the situation, or decide that an Err value is unrecoverable and call panic! to crash the program.
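A minimal sketch of error propagation with ?, using a made-up function add_strs that parses two numbers:

```rust
use std::num::ParseIntError;

// Parses two numbers and adds them; a parse error is handed back to the caller.
fn add_strs(a: &str, b: &str) -> Result<i32, ParseIntError> {
    let x = a.trim().parse::<i32>()?; // on Err, return it immediately
    let y = b.trim().parse::<i32>()?;
    Ok(x + y)
}

fn main() {
    // The caller decides what to do with the error: here it recovers with a message.
    match add_strs("2", "oops") {
        Ok(total) => println!("total: {}", total),
        Err(error) => println!("could not add: {}", error),
    }
}
```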

Read Chapter 9.3 for more on when to panic! or not.

Generics are useful for removing duplicate code and writing functions or structs that operate on abstract types instead of concrete types like i32, String or char.

Generics often use one letter names like T and are defined as such (function and struct) :

fn the_function<T>(value: &[T]) -> &T {
  //do something in the function with a slice of values of type T then return a reference to a value of type T
  &value[0] //for example, the first element (panics if the slice is empty)
}
 
struct TheStruct<T> {
  field1: T,
  field2: T,
}
//This struct has two fields of type T

Generics can be used in methods as well, or concrete types can be used to define methods that only apply if the generic is of that particular type and won't be available to any other types.
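For instance, following the Point example of Chapter 10.1, one impl block can apply to every T while another only applies to f32:

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available for every Point<T>.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Only available when T is f32.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let p = Point { x: 3.0_f32, y: 4.0_f32 };
    println!("{} {}", p.x(), p.distance_from_origin());

    let q = Point { x: 1, y: 2 };
    println!("{} {}", q.x(), q.y);
    // q.distance_from_origin(); // would not compile: q is a Point<i32>
}
```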

Generics in Rust are very efficient at run-time because the compiler replaces all calls using generics by definitions using concrete types in a process named monomorphization. More info

A trait tells the Rust compiler about functionality a particular type has and can share with other types.

Defining

Chapter 10.2 details how to create a trait like this one :

pub trait Summary {
    fn summarize(&self) -> String;
}

Each type implementing the Summary trait must define its own custom behavior for the body of the summarize method. The compiler will enforce this. Note that instead of ending the signature with a semicolon, the trait declaration can provide a default behavior within {} curly brackets.

To implement a trait on a type :

impl Summary for SomeStruct {
    fn summarize(&self) -> String {
        String::from("summary of SomeStruct") //custom behavior goes here
    }
}

If the trait had a default behavior, this implementation will override it.

Default implementations can also call other methods in the same trait, even if those other methods don't have a default implementation.
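A sketch of a default implementation that relies on a required method, and of a type overriding the default. Tweet and Article are illustrative types modelled on Chapter 10.2:

```rust
pub trait Summary {
    // No default: every implementor must provide this method.
    fn summarize_author(&self) -> String;

    // Default implementation, built on the required method above.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

struct Tweet {
    username: String,
}

struct Article {
    author: String,
    title: String,
}

// Uses the default summarize.
impl Summary for Tweet {
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}

// Overrides the default summarize.
impl Summary for Article {
    fn summarize_author(&self) -> String {
        self.author.clone()
    }

    fn summarize(&self) -> String {
        format!("{}, by {}", self.title, self.summarize_author())
    }
}

fn main() {
    let tweet = Tweet { username: String::from("rust") };
    let article = Article { author: String::from("Ferris"), title: String::from("Ownership") };
    println!("{}", tweet.summarize());
    println!("{}", article.summarize());
}
```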

Traits as Parameters

Traits can be used to define functions that accept many different types that all implement a certain trait. The function declares its parameter with that trait (e.g. item: &impl Summary) instead of a concrete type. Check here for more info.
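A sketch of a function taking a trait as parameter, written once with the impl Trait syntax and once with the equivalent trait bound. Tweet and Headline are made-up types:

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

struct Tweet {
    username: String,
    content: String,
}

struct Headline(String);

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

impl Summary for Headline {
    fn summarize(&self) -> String {
        self.0.to_uppercase()
    }
}

// Accepts a reference to any type that implements Summary.
fn notify(item: &impl Summary) -> String {
    format!("Breaking news! {}", item.summarize())
}

// The same function written with a trait bound.
fn notify_generic<T: Summary>(item: &T) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet { username: String::from("mh"), content: String::from("hello") };
    println!("{}", notify(&tweet));
    println!("{}", notify_generic(&Headline(String::from("rust"))));
}
```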

Blanket implementations

Blanket implementations are used extensively in the Rust standard library and allow implementing a trait for any type that implements another trait. More info
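The standard library does this for ToString, which is implemented for every type that implements Display. The same pattern with a made-up trait Describe looks like this:

```rust
use std::fmt::Display;

trait Describe {
    fn describe(&self) -> String;
}

// Blanket implementation: every type that implements Display gets describe().
impl<T: Display> Describe for T {
    fn describe(&self) -> String {
        format!("<{}>", self)
    }
}

fn main() {
    let n: i32 = 42;
    let s = String::from("hi");
    println!("{} {}", n.describe(), s.describe());
}
```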

You can create custom types whose constructor validates its input, so that the type system guarantees the conditions are met wherever the type is used. Doc: Chapter 9.3
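A sketch of such a type, following the Guess example of Chapter 9.3: because the field is private, the only way to obtain a Guess is through new, which checks the range.

```rust
pub struct Guess {
    value: i32, // private: cannot be set directly from outside the module
}

impl Guess {
    // Panics unless the value is between 1 and 100.
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }
        Guess { value }
    }

    // Getter for the private field.
    pub fn value(&self) -> i32 {
        self.value
    }
}

fn main() {
    let guess = Guess::new(42);
    println!("{}", guess.value());
}
```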

  • Constants are not allowed to be made mutable with mut
  • Constants are declared using const and not let keyword
  • The type of a constant must be annotated
  • Constants can be declared in any scope, including the global scope.
  • Constants can only be set to a constant expression, not to a value that can only be computed at runtime.

Doc: Chapter 3.1
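A few constant declarations as a sketch (all names are made up for illustration):

```rust
// Constants can live in the global scope; the type annotation is mandatory.
const MAX_POINTS: u32 = 100_000;
// A constant expression is evaluated at compile time.
const SECONDS_PER_HOUR: u32 = 60 * 60;

fn hours_to_seconds(hours: u32) -> u32 {
    hours * SECONDS_PER_HOUR
}

fn main() {
    // Constants can also be declared in an inner scope.
    const GREETING: &str = "hello";
    println!("{} {} {}", GREETING, MAX_POINTS, hours_to_seconds(2));
}
```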

    for word in text.split_whitespace() {}

Build documentation of a project

cargo doc --open

Show example from /example directory

cargo run --example name_of_example

Discussion thread started in 2017 on GUI in Rust Official Rust Discourse forum

  • Cursive Text User Interface (TUI) : Have not been able to pass data to the application abstractions
  • Azul WebRender-based GUI : More personal documentation here
  • Last modified: 2021/05/25 22:07
  • by mh