RethinkDb.Driver.FSharp
Concepts

If you are unfamiliar with RethinkDB, you may want to review RethinkDB's ten-minute guide to ReQL and their SQL to ReQL cheat sheet.

How Queries Are Built

The commands / functions / methods in the public API (starting with r.), with few exceptions, are building up an Abstract Syntax Tree (AST) for an eventual query; they are not actually executing the commands. This has several interesting implications, but the pertinent one for us is that queries can be developed incrementally. When a run/result or write command is issued with a connection, only then is the query sent to the server. This library extends this concept; by using argument ordering on functions and the Domain Specific Language (DSL) operators, we can configure behavior on the result of the query before the query is run. Once a connection is sent through the pipeline, it all executes as one.

This pattern is not restricted to the library; you can also write custom functions that become part of the execution. Let's say, for example, that you want to validate a user's e-mail address and password; the password is stored as a salted hash (you are salting your hashes, right?), and the salt is also stored in the user's profile. Using the DSL, assuming the User table has a unique index on the email field, this could look something like…

// Type: string -> string -> IConnection -> Task<User option>
let validateUser (email : string) rawPassword =
    let checkPassword user =
        match user with
        | Some u ->
            if hashPassword u.salt rawPassword = u.passwordHash then
                Some { u with passwordHash = "" }
            else
                None
        | None -> None
    fun conn -> backgroundTask {
        let! user = rethink<User list> {
            withTable "User"
            getAll [ email ] (nameof email)
            limit 1
            result conn
        }
        // take the first match (if any), then verify its password
        return user |> List.tryHead |> checkPassword
    }

Given this definition, let isValid = validateUser "bob" "abc123" gives you a value of type IConnection -> Task<User option>; it has built up an AST and defined what to do after the database is queried, but it has not yet queried the database.

This query can help us understand some other common patterns with the F# driver:

getAll always returns a list; to do a one-or-none query based on an index, you will need to use this pattern, or write a generic tryFirst function (signature (IConnection -> Task<'T list>) -> IConnection -> Task<'T option>) to encapsulate it.
backgroundTask is an F# computation expression (CE) that creates a Task<'T> as if .ConfigureAwait(false) had been called. It does not capture the current synchronization context for its continuations (capturing it can lead to deadlocks, especially in library code); it is not the equivalent of Task.Start.
result (and write) return Task<'T>s, as these are native to the underlying C# driver. The driver supports two other variants, though: resultAsync returns an F# Async<'T> type, and resultSync does the waiting in the driver, so it returns 'T and does not need to be awaited.
User, a record type, needs to have the [<CLIMutable>] attribute; unless you need the functionality, you can add [<NoComparison>] and [<NoEquality>] as well.
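
The generic tryFirst function mentioned in the list above could be sketched roughly as follows. This is an illustration of the stated signature, not code taken from the library; to keep it self-contained, the connection is a generic 'Conn type parameter standing in for IConnection (an assumption).

```fsharp
open System.Threading.Tasks

/// Wraps a list-returning query so that it yields the first item, if any.
/// 'Conn stands in for IConnection (an assumption, to keep the sketch self-contained).
let tryFirst (query : 'Conn -> Task<'T list>) : 'Conn -> Task<'T option> =
    fun conn -> backgroundTask {
        // run the wrapped query only once a connection is supplied
        let! items = query conn
        return List.tryHead items
    }
```

Because the wrapper still returns a function awaiting a connection, it composes with the rest of a pipeline in the same deferred fashion as the query it wraps.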

Query Size Considerations

RethinkDB will return multi-document results under 100kb all at once. If the query exceeds that, though, it will return a partial result; in these cases, the result functions will fail. There is an easy solution, though; retrieve a cursor instead. A cursor requires that you MoveNext through it, but this project contains a set of toList functions that will open the cursor, read through it and build a list, then dispose of the cursor. As you write queries, consider how much data you expect to be returned. In many cases, queries will never grow beyond what fits in a single response; for those that may, though, using a cursor is a future-proof way to retrieve that data. rethink<MyType list> { ... result } and rethink<MyType> { ... resultCursor; toList } both return MyType list; the latter will do so no matter how many or how large the resulting documents may be.

There are still memory limits, of course; you may also simply retrieve the cursor and take action on each item as you move through it. In these cases, be sure to define the cursor using use rather than let, as it needs to be disposed when it is no longer needed.
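
The open, read, build, and dispose sequence described above can be sketched without the driver. In this minimal example, IEnumerator<'T> stands in for the driver's cursor type, and drainToList is a hypothetical name, not one of the project's toList functions; the point is only to show how a use binding guarantees disposal.

```fsharp
open System.Collections.Generic

/// Opens a cursor-like enumerator, reads every item into a list, then disposes it.
/// IEnumerator<'T> is a stand-in for the driver's cursor (an assumption).
let drainToList (openCursor : unit -> IEnumerator<'T>) : 'T list =
    // `use` disposes the enumerator when this function exits, even on an exception
    use cursor = openCursor ()
    [ while cursor.MoveNext () do
        yield cursor.Current ]
```

Had the binding been written with let instead of use, the enumerator would stay open until garbage collection, which is the situation the paragraph above warns about.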

An Object, a Function, and a JavaScript String (Walk into a Bar…?)

Many ReQL commands support three basic parameter types: an object, a function, and a string with a JavaScript function. The C# driver maps .NET objects properly (including F#'s anonymous records, defined with {| |}), so you can call insert or replace using a custom type.

ReQL Functions

The C# driver defines a set of ReqlFunction[n] types, but the one you will most likely use is ReqlFunction1, which has the signature ReqlExpr -> obj. Generally speaking, you can create these with anonymous fun row -> ... statements (though the DSL has a small quirk in that regard). For functions passed to update or merge, the C# driver provides a HashMap, which can be used to modify the document before it is written.

A short example, using functions, which updates a user's “lastSeen” field with the current date/time, and a “priorSeen” field with what “lastSeen” was before making this update:

// within a task or backgroundTask CE
do! fromTable "User"
    |> get userId
    |> updateFunc (fun row ->
        r.HashMap("lastSeen", r.Now()).With("priorSeen", row["lastSeen"]))
    |> runWrite
    |> ignoreResult
    |> withConn conn

JavaScript Strings

RethinkDB has a JavaScript engine built in, and commands that take functions also accept strings that will be interpreted and executed in that engine. When creating queries using this technique, follow the JavaScript ReQL API instead of the Java documentation. Revisiting the example above, but implementing it with JavaScript, would look like:

do! fromTable "User"
    |> get userId
    |> updateJSWithOptArgs
        "function (row) { return { lastSeen: r.now(), priorSeen: row('lastSeen') } }"
        [ NonAtomic true ]
    |> runWrite
    |> ignoreResult
    |> withConn conn

Note the [ NonAtomic true ] at the end; this is a great segue to discuss…

Strongly-Typed Optional Arguments

Several ReQL commands have optional arguments that control the execution of the command. In the example above, RethinkDB requires the non-atomic option for a JavaScript update, as it cannot know whether the function is deterministic until it parses it. (It knew the ReQL function was deterministic because of the AST built from that step.) A few other examples include:

between allows you to specify whether the upper or lower bounds are open or closed
getAll allows you to specify an index to use, instead of looking at the primary key
insert and update both provide durability arguments, and insert allows you to specify what action to take if the document already exists

In these cases, we would normally pass strings. In the F# driver, though, there are discriminated unions (DUs) that allow us to specify these options in a strongly-typed way. They should be discoverable in your IDE; however, you can always browse the definitions to see what is available in each context. One more quick example, using the between command:

// ...
    between 1 100 [ LowerBound Open; UpperBound Closed ]
// ...
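
To show why discriminated unions help here, the following sketch translates typed options into the name/value strings that ReQL expects for between (left_bound and right_bound, each "open" or "closed"). The BoundKind and BetweenOption types and the toOptArg function are hypothetical stand-ins written for this example; the driver's own definitions may be shaped differently.

```fsharp
/// Whether a bound includes its endpoint (hypothetical stand-in type)
type BoundKind =
    | Open
    | Closed

/// Optional arguments for a between-style command (hypothetical stand-in type)
type BetweenOption =
    | LowerBound of BoundKind
    | UpperBound of BoundKind

/// Translates one typed option into the name/value strings ReQL expects
let toOptArg opt =
    let boundText kind =
        match kind with
        | Open -> "open"
        | Closed -> "closed"
    match opt with
    | LowerBound kind -> "left_bound", boundText kind
    | UpperBound kind -> "right_bound", boundText kind

// the same option list used in the between example above
let optArgs = [ LowerBound Open; UpperBound Closed ] |> List.map toOptArg
```

A misspelled option name or an invalid value becomes a compile error rather than a runtime failure on the server.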

Command Robustness

While some database drivers handle reconnections seamlessly, RethinkDB does not. The F# driver fills in this gap, though, with retry logic using Polly. In between retry attempts, the library runs .reconnect on the connection; in most transient-error cases, this provides seamless reconnect behavior. There are three general forms of retry functions / DSL operators:

withRetry allows the user to specify the retry attempts by passing delays.
withRetryOnce will immediately retry the command again (equivalent of withRetry [ 0 ]).
withRetryDefault will retry up to 3 times, waiting 200ms, 500ms, and 1s between attempts (equivalent of withRetry [ 200; 500; 1_000 ]).

As with the result and write commands, the default assumes Task<'T>; there are also withAsyncRetry variants for Async<'T> and withSyncRetry variants for non-async 'T.
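
The delay-list semantics can be illustrated with a small synchronous sketch. This is not the library's implementation (which uses Polly); retryWithDelays is a hypothetical name, the reconnect parameter stands in for the connection's reconnect call, and the sketch retries on any exception, which is an assumption.

```fsharp
open System.Threading

/// Runs `action`; on failure, waits for the next delay, reconnects, and tries again.
/// A list of n delays allows up to n retries (n + 1 attempts in total).
let rec retryWithDelays (delaysMs : int list) (reconnect : unit -> unit) (action : unit -> 'T) : 'T =
    match delaysMs with
    | [] -> action () // no retries left; any exception propagates to the caller
    | delay :: rest ->
        let attempt = try Ok (action ()) with ex -> Error ex
        match attempt with
        | Ok value -> value
        | Error _ ->
            Thread.Sleep delay
            reconnect ()
            retryWithDelays rest reconnect action
```

Under this reading, a delay list of [ 0 ] gives one immediate retry, and [ 200; 500; 1_000 ] gives up to three retries with increasing waits.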

A quick example here, using both functions and the DSL, to retrieve all active users, retrying once:

// Type: IConnection -> Task<User list>
let activeUsersFunc = fromTable "User" |> filter {| isActive = true |} |> result<User list> |> withRetryOnce

// Type: IConnection -> Task<User list>
let activeUsersCE = rethink<User list> { withTable "User"; filter [ "isActive", true ]; result; withRetryOnce }

Error Handling

While we'd all like to think that our queries are perfect just the way they are, this is (sadly) not always the case. On retrieval queries, RethinkDB raises exceptions; but, for write operations, errors are returned in a Result object (namespace RethinkDb.Driver.Model). This can be surprising! The F# driver provides two different forms of the write command:

write (and its variants) perform a write, then check for an error in the returned Result; if one is found it raises that as an exception. It returns the Result object, so if you are looking to get other information from it, such as the number of documents affected, you will be able to see this if the write succeeds.
writeResult (and its variants) perform the write and return the Result object; the caller is responsible for checking the FirstError property and handling it accordingly.

Which you use is really up to you; internally, write is simply an exception-raising wrapper around writeResult.
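
The relationship between the two forms can be sketched as follows. WriteOutcome is a hypothetical stand-in for the driver's Result type, and writeResultSketch and writeSketch are illustrative names; none of this is the library's actual code.

```fsharp
/// Stand-in for the driver's Result type, with only the fields used here (hypothetical shape)
type WriteOutcome = { Errors : int; FirstError : string; Replaced : int }

/// Performs the write and returns the outcome untouched; the caller inspects FirstError
let writeResultSketch (perform : unit -> WriteOutcome) : WriteOutcome =
    perform ()

/// Performs the write, raises if the outcome reports an error, otherwise returns it
let writeSketch (perform : unit -> WriteOutcome) : WriteOutcome =
    let outcome = writeResultSketch perform
    if outcome.Errors > 0 then failwith outcome.FirstError
    outcome
```

With the exception-raising form, callers can still read fields such as Replaced on success; with the other form, they must check for errors themselves.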

Those are the main concepts you need to know to be successful using this library. All that is left is to decide between functions or the DSL!
