Algorand Conduit
Conduit is a framework for ingesting blocks from the Algorand blockchain into external applications. It is designed as a modular plugin system that allows users to configure their own data pipelines for filtering, aggregation, and storage of blockchain data.
For example, use Conduit to:
- Build a notification system for on-chain events.
- Power a next-generation block explorer.
- Select app-specific data and write it to a custom database.
- Build a custom Indexer for a new ARC.
- Send blockchain data to another streaming data platform for additional processing (e.g. RabbitMQ, Kafka, ZeroMQ).
- Build an NFT catalog based on different standards.
Getting Started
Installation
Download
The latest conduit binary can be downloaded from the GitHub releases page.
Docker
The latest Docker image is available on Docker Hub.
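As a minimal sketch, the image can be pulled and run with a mounted data directory; this assumes the container expects the data directory at /data, consistent with the entrypoint mentioned later in this document:
docker pull algorand/conduit
docker run -it -v $(pwd)/data:/data algorand/conduit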
Install from Source
- Check out the repo, or download the source:
git clone https://github.com/algorand/conduit.git && cd conduit
- Run make conduit.
- The binary is created at cmd/conduit/conduit.
Usage
Conduit is configured with a YAML file named conduit.yml. This file defines the pipeline behavior by enabling and configuring different plugins.
Create conduit.yml configuration file
Use the conduit init subcommand to create a configuration template. Place the configuration template in a new data directory. By convention the directory is named data and is referred to as the data directory.
mkdir data
./conduit init > data/conduit.yml
A Conduit pipeline is composed of 3 components: Importers, Processors, and Exporters. Every pipeline must define exactly 1 Importer, exactly 1 Exporter, and optionally a series of 0 or more Processors. See a full list of available plugins with conduit list or on the plugin documentation page.
Here is an example conduit.yml that configures two plugins:
importer:
    name: algod
    config:
        mode: "follower"
        netaddr: "http://your-follower-node:1234"
        token: "your API token"
# no processors defined for this configuration
processors:
exporter:
    name: "file_writer"
    config:
        # the default config writes block data to the data directory.
The conduit init command can also be used to select which plugins to include in the template. The example below uses the standard algod importer and sends the data to PostgreSQL. This example does not use any processor plugins.
./conduit init --importer algod --exporter postgresql > data/conduit.yml
Before running Conduit you need to review and modify conduit.yml according to your environment.
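For reference, the exporter section generated for postgresql looks roughly like the sketch below; the hostname and credentials are placeholders, and the template emitted by conduit init is the authoritative source for the available fields:
exporter:
    name: postgresql
    config:
        # connection string for the PostgreSQL database (placeholder values)
        connection-string: "host=localhost port=5432 user=algorand password=algorand dbname=conduit"
        # maximum number of connections in the connection pool
        max-conn: 20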
Run Conduit
Once configured, start Conduit with your data directory as an argument:
./conduit -d data
Full Tutorials
External Plugins
Conduit supports external plugins which can be developed by anyone.
For a list of available plugins and instructions on how to use them, see the External Plugins page.
External Plugin Development
See the Plugin Development page for building a plugin.
Contributing
Contributions are welcome! Please refer to our CONTRIBUTING document for general contribution guidelines.
Migrating from Indexer 2.x
Conduit can be used to populate data from an existing Indexer 2.x deployment as part of upgrading to Indexer 3.x.
We will continue to maintain Indexer 2.x for the time being, but encourage users to move to Conduit. It provides cost benefits for most deployments in addition to greater flexibility.
To migrate, follow the Using Conduit to Populate an Indexer Database tutorial. When you get to the step about setting up postgres, substitute your existing database connection string. Conduit will read the database to initialize the next round.
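For example, the exporter section would point at the existing Indexer database with something like the following sketch (placeholder host and credentials):
exporter:
    name: postgresql
    config:
        # reuse the connection string of the existing Indexer 2.x database;
        # Conduit reads it to determine the next round to process
        connection-string: "host=your-existing-db port=5432 user=indexer password=... dbname=indexer"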
PostgreSQL exporter delete-task does not seem to be working
The PostgreSQL exporter delete-task option does not seem to be working. We were hoping to use the delete-task option to keep our dev databases a bit smaller. Our conduit.yml file looks like this:
However, when running against either a fully-synced PostgreSQL instance or an instance that is starting from 0, it does not appear that the txn table is being culled. Using the above configs, I can see the following in the DB:
I'd have expected the early rows to have been thrown out as conduit moves through later blocks.
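For context, a delete-task section in the postgresql exporter config generally takes the shape sketched below; the values are illustrative and not the reporter's actual configuration:
exporter:
    name: postgresql
    config:
        connection-string: "..."
        delete-task:
            # how often the prune task runs, in rounds (0 disables it)
            interval: 1
            # number of recent rounds whose transactions are kept
            rounds: 1000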
Your environment
Steps to reproduce
Expected behaviour
Only transactions within the last 1000 blocks would be present
Actual behaviour
All transactions appear to be present. Table continues to get longer as conduit works its way through the ledger.
Conduit will not reconnect/continue if algod has been restarted
We have noticed that if we restart our algod instances, conduit gets stuck and will not ingest new blocks once algod has finished restarting and is back online.
Your environment
Software Versions:
algod 3.16.2
conduit 1.2.0
Pipeline configuration
algod config.json:
conduit.yaml:
There is one algod instance, fronted by a Kubernetes service object. The network path looks like:
Conduit ---> k8s service ---> algod
When the algod pod is killed, it takes about 30-60s before it is re-registered in the k8s service and available. The IP addresses do not change.
The Algod pod has a PVC (disk) attached and this same disk is re-attached to the recreated algod pod every time. So algod is starting back up with exactly the same state as when it was shut down.
We have seen this using two ledgers now, Betanet and MainNet.
Steps to reproduce
kubectl delete pod <algod_pod_name>
Expected behaviour
Conduit should fetch the next block from its dedicated algod instance as soon as it has restarted.
Actual behaviour
Conduit gets stuck trying to retrieve the next block. Restarting conduit or restarting algod a second time does not help.
We see the following logs in conduit:
In the same time frame, we see the following logs from algod:
These same log stanzas just repeat over and over.
conduit indexer format
Hi! We started to use algorand/conduit with algorand/indexer and found undesirable behavior:
$ curl 0:8980/v2/blocks/28992333
{"message":"error while looking up block for round '28992333': json decode error [pos 729]: failed to decode address v+gWVeweiq9PYHJc2Wh+C8najmt9AMQQHqsCDEgPKnk= to base 32"}
It seems this happens because conduit writes values in base64 format to the database, whereas indexer wrote and expects this field in base32 format. Are you aware of this problem? Will the next release of conduit fix this?
lint: Fix the `lint` github workflow
Summary
Fix reviewdog linter in the github PR workflow.
Before this, reviewdog silently caused a panic, but the workflow succeeded anyway. This change pins the reviewdog step to explicitly use Go 1.17.13 (it uses the latest 1.20.* version by default, which was causing the panic) and sets it to check with nofilter (by default it only checks added lines of code).
Test Plan
Locally run (requires act):
act pull_request -j lint
PR lint workflow fails when there is a golangci lint error: https://github.com/algorand/conduit/actions/runs/4984363520/jobs/8922595805?pr=76
Closes https://github.com/algorand/conduit/issues/75
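A hedged sketch of the kind of pinning described in the summary above, assuming the workflow uses actions/setup-go and reviewdog/action-golangci-lint; the repo's actual step names and versions may differ:
    - name: Set up Go 1.17.13 for reviewdog
      uses: actions/setup-go@v4
      with:
        go-version: "1.17.13"
    - name: golangci-lint via reviewdog
      uses: reviewdog/action-golangci-lint@v2
      with:
        # report on all lines, not only the added ones
        filter_mode: nofilter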
algod importer: Update sync on WaitForBlock error.
Summary
If algod is restarted after it receives a sync round update but before it fetches the new round(s), then the algod follower and conduit will stall. Conduit will keep waiting for algod to reach the new sync round but it never happens.
This change adds some extra logic to the WaitForBlock call. If there is a timeout or a bad response, a new attempt to set the sync round is made.
This PR also removes the retry loop from the algod importer. Retry is now managed by the pipeline.
Test Plan
Update existing unit tests.
Fix typo on conduit configuration file
Summary
Fixes a typo in the Docker documentation. The entry point uses cp /etc/algorand/conduit.yml /data/conduit.yml, but the documentation refers to config.yml.
Test Plan
I didn't test the changes; I think they can be validated by inspection.
Bug-Fix: don't attempt to get deltas at round 0
Summary
Don't attempt to query deltas from a follower node for round 0. Instead, just declare that we need to catch up and log:
Test Plan
CI - introduced new unit test cases
refactoring: Cleanup dependencies, port indexer utils, delete dead code.
Summary
Clean up code prior to breaking out plugin dependencies:
Test Plan
N/A - non-functional changes.
docker: Add docker to the release pipeline
Summary
Add a docker multi-arch build + deployment configuration.
~For now the docker hub deployment is disabled.~ Hyperflow has added credentials to the repo so that the container can be deployed.
Goreleaser provides binaries from the build step to the Dockerfile to ensure containers have the same files as the archives.
Test Plan
Tested with
goreleaser release --skip-publish --snapshot --clean
The files are here:
The architecture appears as expected:
docs: automation to publish documentation after a release.
Summary
Two parts of this PR:
Automation
Implemented in a new Doc Repo PR Generator / docs-pr workflow. It is designed to run manually, and to run after the goreleaser action. The following are updated:
- reformat.py in the docs repo.
- The docs directory is copied into the docs repo's docs/get-details/conduit directory.
A PR is created on the algorand/docs repo with a number of reviewers tagged automatically. I grabbed the list from other GH actions; maybe a group could be created in the future.
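A hedged sketch of how such a dual trigger might be declared in the workflow file, assuming the release workflow is named goreleaser (the actual workflow name may differ):
on:
  # run manually
  workflow_dispatch:
  # run after the release workflow completes
  workflow_run:
    workflows: ["goreleaser"]
    types: [completed]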
Remote repository / security notes
In order to create a PR on a remote repo, some special access is required. I followed the steps documented here and created a Personal Access Token (PAT). ~The token is a classic token, because I was not able to create a fine-grained token targeting repositories in the algorand org.~ The token is a fine-grained token limited to the algorand/docs repo.
Test Plan
In order to gain access to ${{ secrets.PAT }}, I temporarily changed the on condition to pull_request, and pushed a branch directly to algorand/conduit. After getting the action to run I see the PR created: https://github.com/algorand/docs/pull/1064
Open Issue
The PR seems to be created by me. I tried to override the committer but I must have missed something. Also the CLA Assistant is not working. It looks like the commit is correctly changed to "algo-devops-service" but other parts are not. Maybe it's due to the personal access token?
docs: Fix dynamically changing logo and convert to PNG
Summary
GitHub seems to have issues with SVGs stored in the repository. Changing the assets to PNGs with a hosted URL fixes the issue, and the logo changes dynamically: black for the light theme and white for the dark theme.
Test Plan
Prometheus Metrics for Consideration
Catchall for prometheus metrics we may want to add
Design: Startup an Indexer automatically
Problem
When we want to run an Indexer with Conduit, we need to first start Conduit and then the Indexer with separate startup commands. For a simple single-node Indexer deployment, it would be better if the Indexer API were available when running Conduit.
Solution
Propose a solution to run the Indexer API as part of Conduit.
Log Rotation: document or provide solution
Problem
After a catchup on mainnet with log level INFO, the conduit log is more than 12GB. This is large enough that some sort of log rotation strategy is called for.
Solution
We could either provide our own solution using a Go library, or document how this could be done on various systems. For example, on Linux one could use logrotate (sketched below).
An argument for providing our own solution is platform independence, but a counter-argument is that this is non-standard.
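For example, a minimal logrotate stanza might look like the following; the log path is hypothetical and depends on where Conduit's output is redirected:
/var/log/conduit/conduit.log {
    daily
    rotate 7
    compress
    missingok
    copytruncate
}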
Dependencies
Urgency
Low - I'm not sure how big of an issue this is in the community.
plugin: Creator account filter.
Problem
It is common to have a set of creators who create many assets and/or applications, which are then delegated to other accounts. In this scenario the creator is only relevant at create time. After the assets have been delegated, there are no longer references to the creator accounts.
Suppose a creator has made 10,000 ASAs and you want to track the usage of each of them. The current filter plugin would not allow this because the list of ASAs would vary over time as the creator continues to add new ones.
Solution
A new plugin with the following properties:
During startup an in-memory list of IDs can be resolved from Indexer or algod. While processing blocks the list of IDs would grow or shrink as the creator creates or deletes things. Transactions associated with any of the IDs (or creators) are selected and all others are removed.
Note: instead of providing algod or indexer, the list of IDs can be stored in the data directory.
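A hypothetical configuration for such a plugin might look like the sketch below; the plugin name and every field are illustrative only, since this plugin does not exist yet:
processors:
  - name: creator_filter            # hypothetical plugin name
    config:
      # accounts whose created assets/applications should be tracked
      creators:
        - "CREATOR_ADDRESS_1"
        - "CREATOR_ADDRESS_2"
      # where the initial set of created IDs is resolved at startup:
      # "indexer", "algod", or a file in the data directory
      id-source: "indexer"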
Replace `sirupsen/logrus` with a more performant logger
Problem
In working on #128, I ran some performance tests and noticed that log level had a significant impact. logrus has been observed to be slow in go-algorand, and there is an internal issue (2479) to replace it, as well as a POC branch using zerolog.
Problematic experiment
Using the Justfile command to bootstrap testnet and run a postgresql exporter against it for 300 seconds, I ran it a number of times against both the original pipeline and the new one. Here are the experimental results:
| Log Level | Reps | Original rounds/300 sec (logs/round) | Pipelining rounds/300 sec (logs/round) | Pipelining v Original (%) |
|-----------|------|--------------------------------------|----------------------------------------|---------------------------|
| TRACE     | 3    | 3718 (7.0)                           | 3509 (14.0)                            | -5.6%                     |
| INFO      | 2    | 4578.5 (3.0)                         | 4423.5 (3.0)                           | -3.4%                     |
So comparing the results within each column we can see:
The sample was very noisy, but it looks like each log per round costs around a 1-5% hit in performance.
Action Items
More links
Dependencies
None
Urgency
Medium - as we're currently working on improving Conduit's performance, this seems like a useful avenue to pursue.
Pipeline the Pipeline
Description
Allow for moderate concurrency in the pipeline without sacrificing its sequential integrity.
Summary of Changes
- conduit/pipeline/common.go: introducing generic retrying of pipeline methods via Retries() and RetriesNoOutput()
- conduit/pipeline/pipeline.go:
  - WhyStopped() method
  - joinError() instead of setError() to the pipeline's error property
  - Start() and introducing methods ImportHandler(), ProcessorHandler(), and ExporterHandler()
- conduit/pipeline/pipeline_bench_test.go - benchmarker for the pipeline that includes sleeping plugins with an importer, 2 processors, and an exporter.
- pkg/cli/cli.go: remove a line break from the final error printout
Issues
#118
TODO
- PipelineData inside of BlockData
- OStart() and any resulting orphans in pipeline.go
- master which includes 30 second timeout #133
- ~WhyStopped()?~ NOT FOR NOW. Reconsider this for #97
- ~internal-tools/.../logstats.go be modified?~ This is under the umbrella of the "Logging Plugin Performance" thread
Testing
E2E
pipeline_bench_test.go
Running a new benchmark test twice on the original code and the new, we have the following results. Note the most pertinent result for the typical indexer DB population use case is exporter_10ms_while_others_1ms:
| Benchmark Name | Original rounds/sec | Pipelining rounds/sec | Pipelining v Original (%) |
|------------------------------------------------|---------------------|-----------------------|---------------------------|
| vanilla_2_procs_without_sleep-size-1-8          | 3077                | 3309.5                | +7%                       |
| uniform_sleep_of_10ms-size-1-8                  | 22.32               | 79.815                | +250%                     |
| exporter_10ms_while_others_1ms-size-1-8         | 63.405              | 78.565                | +24%                      |
| importer_10ms_while_others_1ms-size-1-8         | 65.535              | 91.255                | +39%                      |
| first_processor_10ms_while_others_1ms-size-1-8  | 60.28               | 89.175                | +48%                      |
Block Generator Results
Running the block generator test using SCENARIO = scenarios/config.allmixed.small.yml for 30s, with the original code and the new, each time for 2 experiments we have:
| Reset database? | Original rounds/30 sec | Pipelining rounds/30 sec | Pipelining v Original (%) |
|-----------------|------------------------|--------------------------|---------------------------|
| Reset           | 301                    | 400                      | +33%                      |
| No Reset        | 295                    | 418                      | +41%                      |
Local test network 5 minute sprint
I used the Justfile command to bootstrap testnet and run a postgresql exporter against it for 300 seconds. I ran it a number of times against both the original pipeline and the new one. Here are the experimental results:
| Log Level | Reps | Original rounds/300 sec (logs/round) | Pipelining rounds/300 sec (logs/round) | Pipelining v Original (%) |
|-----------|------|--------------------------------------|----------------------------------------|---------------------------|
| TRACE     | 3    | 3718 (7.0)                           | 3509 (14.0)                            | -5.6% 😢                  |
| INFO      | 2    | 4578.5 (3.0)                         | 4423.5 (3.0)                           | -3.4% 😢                  |
On EC2 - CLASSIC vs. PIPELINING
There are much more detailed results in a Google Sheets document, but we have:
SUMMARY
#133 aims to reduce the post-catchup fatal errors to 0