BitBadger.Documents
Getting Started
Overview
Each library has three different ways to execute commands:
- Functions/methods that have no connection parameter at all; for these, each call obtains a new connection. (Connection pooling greatly reduces this overhead and churn on the database.)
- Functions/methods that take a connection as the last parameter; these use the given connection to execute the commands.
- Extensions on the NpgsqlConnection or SqliteConnection type (native for both C# and F#); these are the same as the prior ones, and the names follow a similar pattern (ex. Count.All() is exposed as conn.CountAll()).
This provides flexibility in how connections are managed. If your application does not care about connection management, configuring the library is all that is required. If your application generally does not care, but needs a connection on occasion, one can be obtained from the library and used as required. If you are developing a web application and want to use one connection per request, you can register the library's connection functions as a factory and have that connection injected. We will cover the how-to below for each scenario, but it is worth considering before getting started.
A note on functions: the F# functions use camelCase, while C# calls use PascalCase. To cut down on the noise, this documentation will generally use the C# Count.All form; know that this is Count.all for F#, conn.CountAll() for the C# extension method, and conn.countAll for the F# extension.
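As a quick illustration of the first and third styles, here is a C# sketch; it assumes the asynchronous form of the calls and that they take the table name as a parameter, and "hotel" is a placeholder table name.

// C#, PostgreSQL - sketch of two of the calling styles
// Style 1: no connection parameter; the library opens (and disposes) a connection for this call
var count = await Count.All("hotel");

// Style 3: extension method on a connection you already hold (connection retrieval is covered below)
using var conn = Postgres.Configuration.DataSource().OpenConnection();
var countViaConn = await conn.CountAll("hotel");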
Namespaces
C#
using BitBadger.Documents;
using BitBadger.Documents.[Postgres|Sqlite];
F#
open BitBadger.Documents
open BitBadger.Documents.[Postgres|Sqlite]
For F#, this order is significant; both namespaces have modules that share names, and this order will control which one shadows the other.
Configuring the Connection
The Connection String
Both PostgreSQL and SQLite use the standard ADO.NET connection string format (Npgsql docs, Microsoft.Data.Sqlite docs). The usual location for these is an appsettings.json file, which is then parsed into an IConfiguration instance. For SQLite, all the library needs is a connection string:
// C#, SQLite
// ...
var config = ...; // parsed IConfiguration
Sqlite.Configuration.UseConnectionString(config.GetConnectionString("SQLite"));
// ...

// F#, SQLite
// ...
let config = ... // parsed IConfiguration
Configuration.useConnectionString (config.GetConnectionString("SQLite"))
// ...
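If the application is not already building an IConfiguration elsewhere (an ASP.NET Core app's builder.Configuration already provides one), a minimal sketch of producing it, assuming the connection strings live in an appsettings.json ConnectionStrings section and the Microsoft.Extensions.Configuration.Json package is referenced:

// C#, All - one way to obtain the "parsed IConfiguration" used above
using Microsoft.Extensions.Configuration;

var config = new ConfigurationBuilder()
    .SetBasePath(AppContext.BaseDirectory)
    .AddJsonFile("appsettings.json", optional: false)
    .Build();

// config.GetConnectionString("SQLite") then reads the ConnectionStrings:SQLite value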
For PostgreSQL, the library needs an NpgsqlDataSource instead. There is a builder that takes a connection string and creates one, so it still is not a lot of code (although NpgsqlDataSource implements IDisposable, do not declare it with using or use; the library handles disposal if required):
// C#, PostgreSQL
// ...
var config = ...; // parsed IConfiguration
var dataSource = new NpgsqlDataSourceBuilder(config.GetConnectionString("Postgres")).Build();
Postgres.Configuration.UseDataSource(dataSource);
// ...

// F#, PostgreSQL
// ...
let config = ... // parsed IConfiguration
let dataSource = new NpgsqlDataSourceBuilder(config.GetConnectionString("Postgres")).Build()
Configuration.useDataSource dataSource
// ...
The Connection
- If the application does not care to control the connection, use the methods/functions that do not require one.
- To retrieve an occasional connection (possibly to do multiple updates in a transaction), the Configuration static class/module for each implementation has a way. (For both of these, define the result with using or use so that it is disposed properly; a short example appears below.)
  - For PostgreSQL, the DataSource() method returns the configured NpgsqlDataSource instance; from this, OpenConnection[Async]() can be used to obtain a connection.
  - For SQLite, the DbConn() method returns a new, open SqliteConnection.
- To use a connection per request in a web application scenario, register it with DI.
// C#, PostgreSQL
builder.Services.AddScoped<NpgsqlConnection>(svcProvider => Postgres.Configuration.DataSource().OpenConnection());
// C#, SQLite
builder.Services.AddScoped<SqliteConnection>(svcProvider => Sqlite.Configuration.DbConn());

// F#, PostgreSQL
let _ = builder.Services.AddScoped<NpgsqlConnection>(fun sp -> Configuration.dataSource().OpenConnection())
// F#, SQLite
let _ = builder.Services.AddScoped<SqliteConnection>(fun sp -> Configuration.dbConn ())
After registering, this connection will be available for each request and can be injected into the constructors of things like Razor Pages or MVC Controllers.
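For the occasional-connection case from the list above, here is a minimal C# sketch; the transaction calls are standard ADO.NET rather than library-specific APIs:

// C#, PostgreSQL - obtain a connection (and, if needed, a transaction) from the configured data source;
// "using" ensures both are disposed, returning the connection to the pool
using var conn = Postgres.Configuration.DataSource().OpenConnection();
using var txn = conn.BeginTransaction();
// ... perform multiple updates against "conn" here ...
txn.Commit();

// C#, SQLite - DbConn() returns a new, open connection
using var sqliteConn = Sqlite.Configuration.DbConn();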
Configuring Document IDs
Field Name
A common .NET pattern when naming unique identifiers for entities / documents / etc. is the name Id. By default, this library assumes that this field is the identifier for your documents. If your code follows this pattern, you will be happy with the default behavior. If you use a different property, or implement a custom serializer to modify the JSON representation of your documents' IDs, though, you will need to configure that field name before you begin calling other functions or methods. A great spot for this is just after you configure the connection string or data source (above). If you have decided that the field “Name” is the unique identifier for your documents, your setup would look something like…
// C#, All
Configuration.UseIdField("Name");

// F#, All
Configuration.useIdField "Name"
Setting this will make EnsureTable create the unique index on that field when it creates a table, and will make all the ById functions and methods look for data ->> 'Name' instead of data ->> 'Id'. JSON is case-sensitive, so if the JSON is camel-cased, this should be configured to be id instead of Id (or name to follow the example above).
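To make the example concrete, a document type matching that configuration might look like this (a hypothetical type for illustration, not something defined by the library):

// C#, All - a hypothetical document whose unique identifier is its Name property
public class Customer
{
    public string Name { get; set; } = "";   // acts as the document ID once UseIdField("Name") is configured
    public string Email { get; set; } = "";
}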
Generation Strategy
The library can also generate IDs if they are missing. There are three different types of IDs, and each case of the AutoId enumeration/discriminated union can be passed to Configuration.UseAutoIdStrategy() to configure the library.
- Number generates a “max ID plus 1” query based on the current values of the table.
- Guid generates a 32-character string from a Globally Unique Identifier (GUID), lowercase with no dashes.
- RandomString generates random bytes and converts them to a lowercase hexadecimal string. By default, the string is 16 characters, but can be changed via Configuration.UseIdStringLength(). (You can also use AutoId.GenerateRandomString(length) to generate these strings for other purposes; they make good salts, transient keys, etc.)
All of these are off by default (the Disabled case). Even when ID generation is configured, though, IDs will only be generated when the incoming ID is 0 (for Number) or an empty string (for Guid and RandomString). IDs are only generated on Insert.
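A sketch of enabling one of these strategies, using the calls named above (the exact parameter shapes are assumptions):

// C#, All - generate a random-string ID when a document is inserted with an empty ID
Configuration.UseAutoIdStrategy(AutoId.RandomString);
Configuration.UseIdStringLength(32);   // override the default 16-character length

// The generator can also be used directly, e.g. for salts or transient keys
var key = AutoId.GenerateRandomString(64);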
Numeric IDs are a one-time decision. In PostgreSQL, once a document has a non-numeric ID, attempts to insert an automatic number will fail. One could switch from numbers to strings, and the IDs would be treated as such ("33" instead of 33, for example). SQLite does a best-guess typing of columns, but once a string ID is there, the “max + 1” algorithm will not return the expected results.
Ensuring Tables and Indexes Exist
Both PostgreSQL and SQLite store data in tables and can utilize indexes to retrieve that data efficiently. Each application will need to determine the tables and indexes it expects.
To explore these concepts, let's consider a naive example of a hotel chain; they have several hotels, and each hotel has several rooms. While each hotel could have its rooms as part of a Hotel document, there would likely be a lot of contention when rooms are updated concurrently, so we will put rooms in their own table. The hotel will store attributes like name, address, etc., while each room will have the hotel's ID (stored as HotelId), along with things like room number, floor, and a list of date ranges where the room is not available. (This could be for customer reservations, maintenance, etc.)
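As a sketch of what these documents might look like (hypothetical types for illustration; only the Id and HotelId conventions matter to the library):

// C#, All - hypothetical document types for the hotel/room example
public class Hotel
{
    public string Id { get; set; } = "";        // document ID (the default "Id" field)
    public string Name { get; set; } = "";
    public string Address { get; set; } = "";
}

public class DateRange
{
    public DateOnly From { get; set; }
    public DateOnly To { get; set; }
}

public class Room
{
    public string Id { get; set; } = "";        // document ID
    public string HotelId { get; set; } = "";   // ID of the hotel this room belongs to
    public string RoomNumber { get; set; } = "";
    public int Floor { get; set; }
    public List<DateRange> Unavailable { get; set; } = new();   // reservations, maintenance, etc.
}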
(Note that all of the “ensure” methods/functions below use the IF NOT EXISTS clause; they are safe to run each time the application starts up, and will do nothing if the tables or indexes already exist.)
PostgreSQL
We have a few options when it comes to indexing our documents. We can index a specific JSON field; each table's primary key is implemented as a unique index on the configured ID field. We can also use a GIN index to index the entire document, and that index can even be optimized for a subset of JSON Path operators.
Let's create a general-purpose index on hotels, a “HotelId” index on rooms, and an optimized document index on rooms.
// C#, PostgreSQL
await Definition.EnsureTable("hotel");
await Definition.EnsureDocumentIndex("hotel", DocumentIndex.Full);
await Definition.EnsureTable("room");
// parameters are table name, index name, and fields to be indexed
await Definition.EnsureFieldIndex("room", "hotel_id", new[] { "HotelId" });
await Definition.EnsureDocumentIndex("room", DocumentIndex.Optimized);
// F#, PostgreSQL
do! Definition.ensureTable "hotel"
do! Definition.ensureDocumentIndex "hotel" Full
do! Definition.ensureTable "room"
do! Definition.ensureFieldIndex "room" "hotel_id" [ "HotelId" ]
do! Definition.ensureDocumentIndex "room" Optimized
SQLite
For SQLite, the only option (outside of some quite complex techniques) for JSON indexes is an index on specific fields. Just as with traditional relational indexes, the fields can be ordered to suit the expected queries; in our example, if we index rooms on hotel ID and room number, that index can also be used for efficient retrieval by hotel ID alone.
Let's create hotel and room tables, then index rooms by hotel ID and room number.
// C#, SQLite
await Definition.EnsureTable("hotel");
await Definition.EnsureTable("room");
await Definition.EnsureIndex("room", "hotel_and_nbr", new[] { "HotelId", "RoomNumber" });
// F#, SQLite
do! Definition.ensureTable "hotel"
do! Definition.ensureTable "room"
do! Definition.ensureIndex "room" "hotel_and_nbr" [ "HotelId"; "RoomNumber" ]
Now that we have tables, let's use them!