A Breakneck Guide to the Nim Programming Language
Note: This article is a work-in-progress with many unfinished and entirely missing sections.
This notice will be removed in the future. For now, proceed with low expectations.
Nim is a systems programming language designed by Andreas Rumpf (Araq). It can be variously described as all of the following:
- A statically compiled Python with a comprehensive type system
- A memory-safe language that compiles down to and interfaces seamlessly with C
- A small core language that is extendable through powerful metaprogramming features
- A performant, easy-to-write language with move semantics and optional annotations
Code written in Nim looks like this:
import std/strformat

type Person = object
  name: string
  age: Natural # Ensures the age is non-negative

let people = @[
  Person(name: "John", age: 45),
  Person(name: "Kate", age: 30)
]

proc printAges(people: seq[Person]) =
  for person in people:
    echo fmt"{person.name} is {person.age} years old"

printAges(people)
Without further ado, let’s jump right into it.
Table of Contents
…auto-generated…
- Design Decisions
- Basic Syntax
- Control Flow
- primitive types
- Types
- Functions and Procedures
- procedures
- functions
- return and result
- parameters
- mutable parameters
- static parameters
- varargs
- generics
- methods
- converters
- iterators
- Metaprogramming
- templates
- macros
- Interop
- wrapping c (with c2nim)
- seamless ffi (with futhark)
- Memory Management
Design Decisions
significant whitespace
Perhaps the most obvious feature of Nim is its syntax: it looks like Python! Where’d all the brackets go? Statement blocks in Nim are determined through significant whitespace.
import std/sugar

func takesALambda(a: (string, string) -> string) =
  discard # implementation omitted

# standard syntax for multi-line lambdas
takesALambda(
  func (a, b: string): string =
    return a & b # `&` concatenates strings; `+` is not defined for them
)

# example using syntax sugar. types can be inferred!
takesALambda((a, b) => (a & b))
More specifically: indentation is only significant outside of expressions, where it determines the scope of the next line. Inside an expression it is ignored, so when writing multi-line expressions you must break after an operator.
# break long lines like this...
hereAreSomeQuiteLongFunctions() + andYetAnother() +
  thatReturnValuesForthwith() # indentation here may vary for aesthetics

# not like this! compilation error
hereAreSomeQuiteLongFunctions() + andYetAnother()
+ thatReturnValuesForthwith() # the above is a complete expression
Standard indentation practice uses two spaces, but any (consistent) number works.
Indenting with tabs in Nim is disallowed at the compiler level. I consider this an excellent design decision.
If you absolutely must: adding #? replace("\t", "  ") to the beginning of any Nim file will cause the compiler to treat tabulation characters as two spaces when compiling.
uniform function call syntax
A particularly unique feature of Nim (okay, not unique - D did it first) is what is known as uniform function call syntax (or UFCS for short).
In short, the following statements are equivalent under UFCS:
let a = "Hello, UFCS!"
echo a.len()
echo len(a)
echo a.len
This is made possible by the revelation that there isn’t all that much difference between a method call, a function call that takes a class, and an inherent property of a type. To quote a community member: “How many times have you pondered whether an operation should be a function, member, or method? It’s just a distracting detail with no benefit. And now you don’t need to care!”
This makes many things much nicer in practice. Function chaining, in particular, is now easy: a.foo().bar().baz()
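As a small sketch of chaining under UFCS, the following uses two real procedures from std/strutils (strip and toUpperAscii); the variable name shout is only illustrative:

```nim
import std/strutils

# Each call's result becomes the first argument of the next call.
let shout = "  hello, ufcs!  ".strip().toUpperAscii()
echo shout

# The chained form above is equivalent to this nested form.
doAssert shout == toUpperAscii(strip("  hello, ufcs!  "))
```

The chained version reads left to right in the order the operations happen, which is the main practical benefit.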
style insensitivity
Nim is partially style insensitive.
In other words: identifiers in Nim are considered equal if - aside from their first character - they match once underscores are removed and the remainder is lowercased. This first character exception is so that code like let dog: Dog can be written.
In code:
import std/strutils

func same(a, b: string): bool =
  a[0] == b[0] and
    a.replace("_", "").toLowerAscii == b.replace("_", "").toLowerAscii
This has proven to be somewhat of a controversial feature. Critics say it hampers IDE support, breaks tooling, makes reading documentation harder, and can cause consistency issues. Proponents say language servers handle it fine, alternative tooling is available, you get used to reading documentation, and it helps with codebase consistency.
I like it quite a lot. Being able to adopt a consistent snake_case or camelCase style in your codebase regardless of what external libraries do is a great boon, and the optional (but likely to become default) --styleCheck flags can treat inconsistencies as hints or errors. I would encourage anyone to try it out for a little while before flaming it.
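A minimal sketch of what this looks like in practice, assuming default compiler settings (that is, no --styleCheck:error). The variable userName is hypothetical; toLowerAscii is a real std/strutils procedure:

```nim
import std/strutils

let userName = "Araq"

# `user_name` and `userName` name the same variable.
doAssert user_name == userName

# Library procedures can be called in either style, too.
doAssert "ABC".to_lower_ascii() == "ABC".toLowerAscii()
```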
Basic Syntax
let var const
There are three assignment keywords in Nim: let (for immutable variables), var (for mutable variables), and const (for compile-time evaluated constants).
const testValues = [1, 0, 25]
let immutable = "This variable cannot be changed."
var mutable: string
mutable = stdin.readLine()
mutable = "Disregarding user input..."
The = operator is used for assignment and reassignment.
Note that you can declare a (mutable) variable without assigning anything to it.
With the exception of ref and ptr values, it has a default value depending on the type: more on nil/notnil later.
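The default values for a few common types can be checked directly. This is only an illustrative sketch; the variable names are made up:

```nim
var count: int        # defaults to 0
var ratio: float      # defaults to 0.0
var label: string     # defaults to ""
var flag: bool        # defaults to false
var items: seq[int]   # defaults to an empty sequence

doAssert count == 0
doAssert ratio == 0.0
doAssert label == ""
doAssert flag == false
doAssert items.len == 0
```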
comments
Comments are prefixed with the pound sign #.
Documentation comments are prefixed with ##.
Common convention (and the one used by nim doc) is to put documentation comments directly beneath function signatures or type declarations.
Multiline comments are made with #[ ... ]# and can be nested.
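The three comment forms side by side, in a short sketch (the procedure twice is hypothetical):

```nim
# A regular comment: ignored by the compiler and by `nim doc`.

proc twice(x: int): int =
  ## Returns `x` multiplied by two.
  ## Documentation comments sit directly beneath the signature.
  x * 2

#[ A multiline comment
   #[ with another one nested inside ]#
   that ends here. ]#

echo twice(21)
```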
Control Flow
when / else
when statements are statically (compile-time) evaluated if statements. The else keyword can be used with them.
While there’s not much else to say about when statements themselves, it helps to see examples of the kinds of conditions they can evaluate.
when defined(macosx):
  discard # macOS-specific code

when defined(js):
  discard # code for the JavaScript backend

# check if the file is compiled with `-d:release`
when not defined(release):
  discard # do some debug code here

# check if some code compiles with no errors
when compiles(3 + 4):
  discard # the `+` operation is defined for integers

# check whether a library provides a certain feature
when not declared(strutils.toUpper):
  discard # let's provide our own, then
case / of
The case statement allows for compiler-checked pattern matching. A case statement must handle all possibilities.
var x = stdin.readChar()
case x
of 'a'..'z', 'A'..'Z':
  echo "A letter!"
of '0'..'9':
  echo "A number!"
else:
  echo "Something else!"
Idiomatic Nim neither puts a colon after the case parameter nor indents the of blocks. Both of those are, however, valid Nim. (This may change in the future.)
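The exhaustiveness check is easiest to see with an enum. In this sketch (the Direction type and turnRight procedure are hypothetical), no else branch is needed because every value is covered:

```nim
type Direction = enum
  North, East, South, West

proc turnRight(d: Direction): Direction =
  # No `else` branch: every value of `Direction` is listed.
  # Removing any `of` branch makes this a compile-time error.
  case d
  of North: East
  of East: South
  of South: West
  of West: North

doAssert turnRight(North) == East
doAssert turnRight(West) == North
```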
for / in
todo
while
todo
block / break / continue
todo
try / finally / except / raise
todo
Type System
Nim has a static (i.e. compile-time evaluated) and comprehensive type system.
- primitive types: int, float, char, bool
- collection types: array[T], seq[T], set[T]
- range types: range[T]
- generic types: [T] and [T: int | float]
- structured types: types declared with the object or tuple keywords
- enumerated types: types declared with the enum keyword
- procedure types: proc, func
- reference types: automatically managed references declared with the ref keyword
- pointer types: unsafe, manually managed pointers declared with the ptr keyword
- parameter types: var, static, typedesc
- distinct types: types declared with the distinct keyword
- iterable types: openarray, string, seq, array
- optional types: Option[T], Result[T, E]
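A few of these categories in one place, as a rough sketch. All type and variable names here are invented for illustration; Option comes from std/options, while Result is left out because it is not assumed to be available in the standard library:

```nim
import std/options

type
  Percent = range[0..100]   # range type
  Meters = distinct float   # distinct type: not interchangeable with float
  Color = enum              # enumerated type
    Red, Green, Blue
  Point = tuple[x, y: int]  # structured type (tuple)

let
  fixed: array[3, int] = [1, 2, 3]  # array: length is part of the type
  growable: seq[int] = @[1, 2, 3]   # seq: length can change at run time
  flags: set[char] = {'a', 'b'}     # set of an ordinal type
  progress: Percent = 42
  height = Meters(1.8)
  favourite = Green
  origin: Point = (x: 0, y: 0)
  maybe: Option[int] = some(5)

doAssert fixed.len == growable.len
doAssert 'a' in flags
doAssert progress <= 100
doAssert float(height) == 1.8
doAssert ord(favourite) == 1
doAssert origin.x == 0
doAssert maybe.get == 5
```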
variant types
By combining the case statement with Nim’s object types, it is possible to create what are known as variant types. Variant types can have different fields depending on the value of the matched field. These are also known by a wide variety of other names: including tagged unions, choice types, discriminated unions, and sum types.
This is best explained with an example:
import std/tables

type NodeKind = enum
  Text, Element

type Node = ref object
  x, y: float
  width, height: float
  case kind: NodeKind # the discriminator field needs its type
  of Text:
    text: string
  of Element:
    tag: string
    attributes: Table[string, string]
    children: seq[Node]
In many cases, variant types provide a more idiomatic alternative to generics. However, they have their limitations: field names may not be reused across cases, and the kind of the variant is just a field within the object rather than a higher-level identifier as in Rust’s enums.
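To show how such a type is constructed and consumed, here is a sketch using a trimmed-down copy of the Node type (position and size fields dropped for brevity). The describe procedure is hypothetical:

```nim
import std/tables

type
  NodeKind = enum
    Text, Element
  Node = ref object
    case kind: NodeKind
    of Text:
      text: string
    of Element:
      tag: string
      attributes: Table[string, string]
      children: seq[Node]

proc describe(node: Node): string =
  # Branch on the discriminator before touching branch-specific fields.
  case node.kind
  of Text: "text: " & node.text
  of Element: "<" & node.tag & "> with " & $node.children.len & " children"

let leaf = Node(kind: Text, text: "hello")
let root = Node(kind: Element, tag: "p", children: @[leaf])

echo describe(leaf)
echo describe(root)
# With run-time checks enabled, accessing `leaf.tag` would raise a FieldDefect.
```

Because kind is an ordinary field, the case statement over it is checked for exhaustiveness just like any other enum.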
Functions and Procedures
procedures
What are typically known as functions in other languages are known in Nim as procedures.
Procedures use the proc keyword, followed by a name, (optional) parameters, an (optional) return type, and the procedure body.
proc plusOrMinus(a: bool, b, c: int): int =
  if a:
    return b + c
  else:
    return b - c

# You don't even need parentheses if your procedure doesn't take parameters!
proc anotherProcedure =
  echo "This procedure doesn't return anything."
functions
What Nim considers functions are typically known as pure functions in other languages. Functions are declared identically to procedures, only with the func keyword instead of the proc keyword. Functions are statically guaranteed by the compiler to have no side effects.
Side effects are considered to be any action modifying state outside of the function’s current scope. This includes modifying a global variable declared outside of the function, modifying var T parameters (more on those later), and I/O.
Side effects do not currently include the modification of a ref type (more on those later), but this behavior is expected to change in the near future. See: {.experimental: "strictFuncs".}
As a special exception, the debugEcho procedure is not considered to have side effects - despite dealing with I/O - through compiler magic. This is to allow for easier debugging of pure functions.
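A brief sketch contrasting the two keywords; counter, bump and square are hypothetical names, while debugEcho is the real system procedure mentioned above:

```nim
var counter = 0

proc bump(): int =
  # A proc may modify global state.
  counter += 1
  counter

func square(x: int): int =
  # debugEcho is permitted inside a func.
  debugEcho "squaring ", x
  # `counter += 1` or `echo x` here would be rejected by the compiler.
  x * x

doAssert bump() == 1
doAssert square(4) == 16
```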
Note that this guide has used the terms procedure and function interchangeably, and will continue to do so.
return and result
The return keyword returns the provided value and instantly exits the function, just like many other languages.
While you can simply return out of a procedure any time, Nim also provides an implicit result variable.
The result variable is initialized to the default value of the return type at the beginning of the function’s scope. If nothing has been explicitly returned by the end of the scope, the current value of result is returned. This allows for writing cleaner, more idiomatic code.
An example of return and result is as follows:
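A minimal sketch with two hypothetical procedures, one relying on an early return and one on the implicit result variable:

```nim
proc firstNegative(values: seq[int]): int =
  # `return` exits as soon as a match is found.
  for v in values:
    if v < 0:
      return v
  return 0

proc sumPositive(values: seq[int]): int =
  # `result` starts at 0, the default for int, and is returned implicitly.
  for v in values:
    if v > 0:
      result += v

doAssert firstNegative(@[3, -1, -7]) == -1
doAssert sumPositive(@[3, -1, 4]) == 7
```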
discard
The discard keyword allows for evaluating an expression that returns a value without doing anything with that value. This is best explained with an example:
proc returns(): bool =
  echo "This procedure runs some code and returns a value."
  return true

# fails: expression `returns()` is of type `bool` and has to be used...
returns()

# compiles: ... or discarded
discard returns()
mutable parameters
Parameters are immutable by default and passed by value (Copy-ed, for Rust programmers). The var keyword is reused in function signatures to denote a mutable parameter. This is best explained with an example:
proc immutableParameters(a: bool) =
  a = true # Error: `a` cannot be assigned to

proc mutableParameters(a: var bool) =
  a = true

var a = false # must be `var`: a `let` cannot be passed to a `var` parameter
mutableParameters(a) # Compiles fine, a is now true
The compiler will try and optimize copies into moves, and can be helped out some by the programmer. More on that later.
Note that ref types behave somewhat unintuitively as parameters. A ref type is simply an automatically-managed pointer to some memory. The pointer (memory address), not the memory itself, is copied into the function. This makes the data of ref types mutable without the var annotation.
A var on a ref type parameter, then, lets you change which object the caller's ref variable points to. This is usually a mistake.
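The difference can be sketched as follows. The Counter type and both procedures are hypothetical and only serve to illustrate the behavior:

```nim
type Counter = ref object
  value: int

proc increment(c: Counter) =
  # No `var` needed: the ref is copied, the object it points to is shared.
  c.value += 1

proc swapOut(c: var Counter) =
  # `var` lets the procedure rebind the caller's variable to a new object.
  c = Counter(value: 100)

var counter = Counter(value: 1)
let alias = counter

increment(counter)
doAssert alias.value == 2      # same object, so the alias sees the change

swapOut(counter)
doAssert counter.value == 100
doAssert alias.value == 2      # the alias still points at the old object
```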
static parameters
The static keyword is also reused in function signatures to denote a parameter that must be known at compile time. This is best explained with an example:
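A minimal sketch, using a hypothetical procedure named repeated. Because times is static, it can be inspected at compile time inside the body:

```nim
proc repeated(text: string, times: static int): string =
  # `times` is a compile-time constant inside the body,
  # so it can be used in a `when` condition.
  when times < 0:
    {.error: "times must not be negative".}
  for _ in 0 ..< times:
    result.add text

const n = 3
doAssert repeated("ab", n) == "ababab"
doAssert repeated("ab", 1 + 1) == "abab"

# let m = stdin.readLine().len
# discard repeated("ab", m)   # rejected: `m` is not known at compile time
```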
Note that a: static T is equivalent to a: static[T].
varargs
The varargs keyword allows you to specify that a function can take a dynamic number of parameters. Only one parameter can be varargs, and it must be the last parameter in the function signature. This is best explained with an example:
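A short sketch with a hypothetical procedure named total, which takes one ordinary parameter followed by a varargs parameter:

```nim
proc total(label: string, numbers: varargs[int]): string =
  # Inside the body, `numbers` behaves like an openArray[int].
  var sum = 0
  for n in numbers:
    sum += n
  label & ": " & $sum

doAssert total("none") == "none: 0"
doAssert total("three", 1, 2, 3) == "three: 6"
# An array can be passed in place of the individual arguments.
doAssert total("array", [4, 5]) == "array: 9"
```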
Metaprogramming
todo.
Interop
todo.
Memory Management
Nim’s memory management strategy is optimized reference counting with a cycle breaker. This may surprise some people, because one of Nim’s primary design goals is being efficient, and reference counting is typically considered to be less efficient than tracing GCs.
Nim’s version of reference counting (called ARC/ORC) brings to the table two things, however:
- Hard determinism
- Optimizing reference counts away with move semantics.
The former (hard determinism) comes from ARC/ORC not doing any sort of magic with deferred reference counts, and instead injecting destructors into the generated code. These injected destructors also provide other niceties, such as automagically-closing file streams.