Lightning Stream syncs LMDB databases through S3 buckets between multiple servers, including PowerDNS Authoritative server 4.8+ LMDBs

  • By PowerDNS
  • Last update: Apr 29, 2023
  • Comments: 12

Lightning Stream

User documentation can be found here


Lightning Stream is a tool to sync changes between a local LMDB (Lightning Memory-Mapped Database) and an S3 bucket in near real-time. If the application schema is compatible, this can be used in a multi-writer setup where any instance can update any data, with a global eventually-consistent view of the data in seconds.

Our main target application is the sync of LMDB databases in the PowerDNS Authoritative Nameserver (PDNS Auth). We are excited about how Lightning Stream simplifies running multiple distributed PowerDNS Authoritative servers, with full support for keeping DNSSEC keys in sync. Check the Getting Started section to understand how you can use Lightning Stream together with the PowerDNS Authoritative server.

Its use is not limited to the PowerDNS Authoritative server, however. Lightning Stream does not make any assumptions about the contents of the LMDB, and can be used to sync LMDBs for other applications, as long as the data is stored using a compatible schema.

Basic Operation

Lightning Stream is deployed next to an application that uses an LMDB for its data storage:

Overview

Its operation boils down to the following:

  • Whenever it detects that the LMDB has changed, it writes a snapshot of the data to an S3 bucket.
  • Whenever it sees a new snapshot written by a different instance in the S3 bucket, it downloads the snapshot and merges the data into the local LMDB.

The merge of a key is performed based on a per-record last-modified timestamp: the most recent version of the entry wins. Deleted entries are cleared and marked as deleted, together with their deletion timestamp. This allows Lightning Stream to provide eventual consistency across all nodes.

If the application uses a carefully designed data schema, this approach can support multiple simultaneously active writers. With other schemas, it can often be used to sync data from one writer to multiple read-only receivers, or simply to create a near real-time backup of a single instance.

Building

At the time of writing, this project requires Go 1.19. Please check the go.mod file for the current version.

To install the binary in a given location, simply run:

GOBIN=$HOME/bin go install ./cmd/lightningstream/

Or run ./build.sh to install it in a bin/ subdirectory of this repo.

Easy cross compiling is not supported, because the LMDB bindings require CGo.

Example in Docker Compose

This repo includes an example of syncing the PowerDNS Authoritative Nameserver LMDB. It runs two DNS servers, each with their own syncer, syncing to a bucket in a MinIO server.

The Lightning Stream config used can be found in docker/pdns/lightningstream.yaml. Note that the config file contents can reference environment variables.

To get it up and running:

docker-compose up -d

You may need to rerun this command once, due to a race condition when creating the LMDBs.

To see the services:

docker-compose ps

This should show output like:

         Name                        Command               State                                    Ports
-------------------------------------------------------------------------------------------------------------------------------------------
lightningstream_auth1_1   /run.sh                          Up      127.0.0.1:4751->53/tcp, 127.0.0.1:4751->53/udp, 127.0.0.1:4781->8081/tcp
lightningstream_auth2_1   /run.sh                          Up      127.0.0.1:4752->53/tcp, 127.0.0.1:4752->53/udp, 127.0.0.1:4782->8081/tcp
lightningstream_minio_1   /usr/bin/docker-entrypoint ...   Up      127.0.0.1:4730->9000/tcp, 127.0.0.1:4731->9001/tcp
lightningstream_sync1_1   /usr/local/bin/lightningst ...   Up      127.0.0.1:4791->8500/tcp
lightningstream_sync2_1   /usr/local/bin/lightningst ...   Up      127.0.0.1:4792->8500/tcp

Open one terminal with all the logs:

docker-compose logs

Then in another terminal call these convenience scripts, with a delay between them to allow for syncing:

docker/pdns/pdnsutil -i 1 create-zone example.org
docker/pdns/pdnsutil -i 1 secure-zone example.org
docker/pdns/pdnsutil -i 1 set-meta example.org foo bar
docker/pdns/pdnsutil -i 2 generate-tsig-key example123 hmac-sha512

sleep 2

docker/pdns/curl-api -i 2 /api/v1/servers/localhost/zones/example.org
docker/pdns/curl-api -i 2 /api/v1/servers/localhost/zones/example.org/metadata
docker/pdns/curl-api -i 1 /api/v1/servers/localhost/tsigkeys

To view a dump of the LMDB contents:

docker/pdns/dump-lmdb -i 1
docker/pdns/dump-lmdb -i 2

You can browse the snapshots in MinIO at http://localhost:4731/buckets/lightningstream/browse (login with minioadmin / minioadmin).

Open Source

This is the documentation for the Open Source edition of Lightning Stream. For more information on how we provide support for Open Source products, please read our blog post on this topic.

PowerDNS also offers an Enterprise edition of Lightning Stream that includes professional support, advanced features, deployment tooling for large deployments, Kubernetes integration, and more.


Comments (12)

  • 1

    Upgrade docker compose to Auth 4.8 with native schema

    Update the configuration to use Auth 4.8 with native LMDB schema.

    ~~It currently points to the latest 'master' image of Auth, we need to change it to the right one once there is a final release.~~

    ~~Currently blocked by #33.~~

  • 2

    Waiting for initial receiver listing: "file does not exist" error

    When starting up a fresh docker compose setup, LS hangs on this error:

    lightningstream-sync1-1  | level=info msg="[main          ] Waiting for initial receiver listing" 
                               db=main error="file does not exist" instance=instance-1
    

I suspect that Simpleblob is returning a "file does not exist" error instead of an empty listing.

  • 3

Go tests: run in GitHub Actions, and make the sync tests reliable

Run Go tests in GitHub Actions.

    To make the tests pass consistently:

    Explicitly pass the lmdb.Env to the syncer, so that we have more control over its lifetime and avoid opening and closing it multiple times, which can cause crashes in the LMDB C code.

    Rewrite the sync tests to wait for expected values, instead of relying on arbitrary sleeps that make the test flaky and slow.

    Do not close the lmdb.Env in the sync tests, because for some reason it causes random SEGFAULTs on Linux (not on macOS).

    Upgrade golangci-lint for Go 1.20

  • 4

    Add documentation

    Writing documentation using mkdocs.

To view the documentation locally from a checkout of the repo:

    python3 -m venv .venv
    .venv/bin/pip install -r docs/requirements.txt
    .venv/bin/mkdocs serve
    
  • 5

    Bump golang.org/x/net from 0.1.0 to 0.7.0

    Bumps golang.org/x/net from 0.1.0 to 0.7.0.

    Commits
    • 8e2b117 http2/hpack: avoid quadratic complexity in hpack decoding
    • 547e7ed http2: avoid referencing ResponseWrite.Write parameter after returning
    • 39940ad html: parse comments per HTML spec
    • 87ce33e go.mod: update golang.org/x dependencies
    • 415cb6d all: fix some comments
    • 7e3c19c all: correct typos in comments
    • 296f09a http2: case insensitive handling for 100-continue
    • f8411da nettest: fix tests on dragonfly and js/wasm
    • 8e0e7d8 go.mod: update golang.org/x dependencies
    • 7805fdc http2: rewrite inbound flow control tracking
    • Additional commits viewable in compare view


    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


  • 6

    Add check-spelling

    This adds check-spelling, the workflow used to build #39.

As configured here, it's using the latest release (v0.0.21), without commenting enabled (reporting is via the GitHub Step Summary) and without SARIF reporting.

    It's always a good idea to review the basic expect.txt file to make sure there aren't any obvious typos that my initial pass didn't correct. If terms are permanent, they can be moved/promoted to allow.txt instead of remaining in expect.txt (see the advice.md file for some prose on this point).

    advice.md is repository specific content that's included in reports, so please feel free to adjust it based on your repository's needs.

  • 7

Reduce memory used when loading a snapshot

    The generated protobuf code made copies of all key/value byte slices. This patches the generated code to directly return the byte slices.

    In local testing, this reduced memory allocations when loading snapshots by half. PR #31 will provide bigger savings later.

The next step is to rewrite the protobuf handling to iterate over the protobuf data instead of creating the []KV slices, but that requires a significant rewrite.

    Additionally, this enables the net/http/pprof endpoints for debugging.

    Output of go tool pprof http://127.0.0.1:9152/debug/pprof/heap and then top 10 below, with a test set of 1M domains and 6M records.

    Before:

          flat  flat%   sum%        cum   cum%
     2047.55MB 56.26% 56.26%  2722.09MB 74.80%  powerdns.com/platform/lightningstream/snapshot.(*DBI).Unmarshal
      674.54MB 18.54% 98.45%   674.54MB 18.54%  powerdns.com/platform/lightningstream/snapshot.(*KV).Unmarshal
    

    After:

     1378.77MB 52.20% 52.20%  1378.77MB 52.20%  powerdns.com/platform/lightningstream/snapshot.(*DBI).Unmarshal
    
  • 8

    Experimental pdns-v5-fix-duplicate-domains command

    The experimental pdns-v5-fix-duplicate-domains command can be used to fix duplicate domain entries in pdns auth 4.8 schemaversion 5 LMDBs.

    If this is ever merged, it will likely be removed again once PDNS Auth has an internal mechanism to resolve this.

    Example run:

    $ lightningstream --log-level debug -c pdns-native.yaml experimental pdns-v5-fix-duplicate-domains -d main --dangerous-do-rename
    DEBU[0000] Running                                       version=dev
    DEBU[0000] Scan                                          domain=dup.example domain_id=428582697
    DEBU[0000] Scan                                          domain=dup.example domain_id=579235200
    ERRO[0000] Duplicate domain entry!                       domain=dup.example domain_id=579235200 header_ts="2023-03-31 17:57:02.07626496 +0800 CST" keeping_oldest_id=428582697 prev_domain_id=428582697 prev_header_ts="2023-03-31 17:56:59.632763904 +0800 CST" renaming_newest_id=579235200
    WARN[0000] PATCHING                                      domain=dup.example domain_id=579235200 flags=1 key="..example.dup.\".m. [00 0c 65 78 61 6d 70 6c 65 00 64 75 70 00 22 86 6d 80]" val=".Q|.N..P................ [17 51 7c ec 4e 99 dc 50 00 00 00 00 00 00 00 0d 00 01 00 00 00 00 00 00] (2023-03-31T11:18:10.358738Z)"
    WARN[0000] PATCHING                                      domain=dup.example.dup-579235200.invalid domain_id=579235200 flags=0 key=".\"invalid.dup-579235200.example.dup.\".m. [00 22 69 6e 76 61 6c 69 64 00 64 75 70 2d 35 37 39 32 33 35 32 30 30 00 65 78 61 6d 70 6c 65 00 64 75 70 00 22 86 6d 80]" val=".Q|.N.YP................ [17 51 7c ec 4e 9a 59 50 00 00 00 00 00 00 00 0d 00 00 00 00 00 00 00 00] (2023-03-31T11:18:10.35877Z)"
    DEBU[0000] Scan                                          domain=dup.example.dup-579235200.invalid domain_id=579235200
    INFO[0000] Done
    
    
    The following zones need to be removed with `pdnsutil delete-zone ZONE`:
    
    - dup.example.dup-579235200.invalid
    
    Note that these will NOT show up in list-all-zones. Removing the zones will also not remove it from this list.
    

Additionally, this switches the Linux binary builds from Ubuntu 22.04 to 20.04 for greater compatibility with other distributions. This is needed because we use CGo for the LMDB bindings.

  • 9

    Simpleblob: clear error when the bucket does not exist

Upgrade simpleblob when https://github.com/PowerDNS/simpleblob/pull/31 is merged and released, to get clear errors when buckets do not exist, and in other failure scenarios.

  • 10

    Check spelling updates

    This is a bundle of fairly unrelated things.

    Note that the commit messages for some of these changes are incredibly long.

The m_data comment update corresponds to this: https://github.com/check-spelling/spell-check-this/commit/4f81233290253fc998a6ce3e9fe7b638d672dd86. @Habbie asked me if I knew where it came from, and at the time it was written, I didn't; it was just something mentioned by https://github.com/nasa/fprime as a thing they'd like to be able to handle (https://github.com/nasa/fprime/discussions/855). It turns out their repository actually did indicate which vendor had the issue (https://github.com/nasa/fprime/commit/aa4cb0006fd6d8abae086d521fb5fa70d9ee7f66), but I didn't check for that and merely noted it as a theoretical case. I've now updated the check-spelling wiki entry to provide more information about its pedigree.

    The expect updates would be triggered the next time someone introduced a misspelling, but since I'm already making a PR, I'm bundling it here. (They're mostly the result of #44's changes which excluded checking the .github/workflows directory.) Fwiw, this is a conscious tradeoff. check-spelling aims for a form of "eventual consistency". It's more or less ok to temporarily have a couple of stray expected items as they'll be cleaned up the next time someone tries to add an unexpected word. If there's ever a case where you really don't want a word coming in, you can use the forbidden feature (as m_data could be -- see above).

  • 11

    CI: build binaries for linux and macOS

    Checking if we can build binaries for Linux (amd64) and macOS (amd64 and arm64).

Since LS uses CGo for the LMDB bindings, cross compiling is more complicated. Perhaps we can use Docker's support for running under QEMU?

The macOS ones may be of limited value, because they are not signed: for most users, running them may be more of a hassle than compiling them themselves. Perhaps add to Homebrew instead?

  • 12

    Change mod import path to github or valid path

    Change the powerdns.com/platform/lightningstream import path to github.com/PowerDNS/lightningstream so that go install and https://pkg.go.dev/github.com/PowerDNS/lightningstream work correctly.

The alternative would be to add these URL endpoints with special HTML meta tags on our website and make sure they always work, which is probably not worth the trouble.