A custom IPFS/Filecoin node that makes it easy to pin IPFS content and make Filecoin deals.

  • By Application Research
  • Last update: Dec 14, 2022
  • Comments: 14

Estuary

An experimental IPFS node

Building

Requirements:

  • go (1.15 or higher)
  • jq
  • hwloc
  • opencl

Once the requirements are installed, run make clean all inside the estuary directory.
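
On a fresh Debian/Ubuntu machine, the whole build might look like the following sketch (the package names here are assumptions for those distributions; check your distribution's equivalents, and install Go 1.15 or newer separately if the packaged version is too old):

# assumed Debian/Ubuntu package names for the jq, hwloc, and opencl requirements
sudo apt install -y jq libhwloc-dev ocl-icd-opencl-dev
git clone https://github.com/application-research/estuary.git
cd estuary
make clean all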

Running

To run locally in a 'dev' environment, first run:

./estuary setup

Save the auth token that this outputs; you will need it for interacting with and controlling the node.

NOTE: if you want to use a database other than a sqlite instance stored in your local directory, you will need to configure it with the --database flag, passed before the setup command: ./estuary --database=XXXXX setup
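
As a sketch, a Postgres-backed setup might look like the line below (the DSN format is an assumption; check the --database flag's documentation for the exact syntax your version expects):

./estuary --database="postgres=host=127.0.0.1 user=estuary dbname=estuary" setup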

Once setup is complete, choose an appropriate directory for estuary to keep its data, and pass it via the --datadir flag when running estuary. You will also need to tell estuary where it can access a lotus gateway API; we recommend using:

export FULLNODE_API_INFO=wss://api.chain.love

Then run:

./estuary --datadir=/path/to/storage --database=IF-YOU-NEED-THIS --logging
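
By default the node serves its HTTP API on port 3004 (as seen in the startup logs quoted in the comments below), so a quick liveness check might be:

curl http://localhost:3004/health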

Systemd

The Makefile has a target that will install a generic but workable systemd service for estuary.

Run make install-estuary-service on the machine you wish to run estuary on.

Make sure to follow the instructions output by the make command, as configuration is required before the service can run successfully.
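
Assuming the installed unit is named estuary (an assumption; use whatever name the make target reports), enabling and watching it would look like:

sudo systemctl daemon-reload
sudo systemctl enable --now estuary
journalctl -u estuary -f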

Download

estuary.zip

Comments (14)

  • 1

Seeking Instructions on How to Run My Own Node

    This project looks awesome! Really excited to see a Filecoin interactive project with a fleshed-out interface. Great work! The only issue is that all the documentation assumes the user is going to apply for an API key, which is totally fine, but what if I wanted to run my own node (i.e., make it truly decentralized)? Are there instructions on how to do this? I can't find any.

    Again, just to be clear, I'm looking for a way to run this project so I don't need to be granted an API key (other than the key I grant myself in the backend of my server running an Estuary node). I feel like that would make it a truly permissionless way to interact with Filecoin.

    Update 08/23/21
    So I tried to install this repo locally to see if I could get it to run without any external node running. I ran git clone https://github.com/application-research/estuary.git, cd estuary/, make, then make install. I then ran estuary, as it was now in my path, but it gave me this message: could not get API info: could not get api endpoint: API not running (no endpoint). Where do I go from here?
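
    That error reads as though estuary cannot find a lotus API endpoint. Based on the setup notes above, one thing to try (a suggestion, not a confirmed fix) is pointing estuary at the public lotus gateway before running it:

    export FULLNODE_API_INFO=wss://api.chain.love
    ./estuary setup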

  • 2

    Add /collections/{coluuid}/content/{contentid} [delete] endpoint

    This PR adds the endpoint to remove a piece of content from a collection; it also refactors some of the code so it's more reusable.

    TODO:

    • [x] Create GetContent function
      • [x] Create contents package
    • [x] Can one content appear twice in a collection? If so, we need the path specified (YES)
      • [x] Can we have the same content in the same collection and on the same path? (NO)
        • [x] Specify path on delete endpoint

    Fixes #421

    Verification:

    + APIKEY=EST599c28b4-fd89-4b44-b579-74af0997a5ecARY
    + path1=/my/path/1
    + path2=/my/path/2
    + curl -X POST -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' http://localhost:3004/collections/create -d '{ "name": "A new collection", "description": "A new collection test" }' -s
    
    + coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366
    + echo coluuid: 5b73a91a-0dfc-4f6b-94d1-aad8c9881366
    coluuid: 5b73a91a-0dfc-4f6b-94d1-aad8c9881366
    + curl -X POST -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' 'http://localhost:3004/content/add?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366&dir=/my/path/1' -F [email protected] -s
    
    + file1=58
    + echo 'added file1 (58) to estuary'
    added file1 (58) to estuary
    + curl -X POST -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' 'http://localhost:3004/content/add?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366&dir=/my/path/2' -F [email protected] -s
    
    + file2=59
    + echo 'added file2 (59) to estuary'
    added file2 (59) to estuary
    + curl -X GET -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' 'http://localhost:3004/collections/content?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366' -s
    + jq
    [
      {
        "id": 58,
        "updatedAt": "2022-09-14T18:40:21.279946787Z",
        "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
        "name": "estuary",
        "userId": 1,
        "description": "",
        "size": 193993620,
        "type": 0,
        "active": true,
        "offloaded": false,
        "replication": 6,
        "aggregatedIn": 0,
        "aggregate": false,
        "pinning": false,
        "pinMeta": "",
        "replace": false,
        "origins": "",
        "failed": false,
        "location": "local",
        "dagSplit": false,
        "splitFrom": 0,
        "path": "/my/path/1/estuary"
      },
      {
        "id": 59,
        "updatedAt": "2022-09-14T18:40:22.910605199Z",
        "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
        "name": "estuary",
        "userId": 1,
        "description": "",
        "size": 193993620,
        "type": 0,
        "active": true,
        "offloaded": false,
        "replication": 6,
        "aggregatedIn": 0,
        "aggregate": false,
        "pinning": false,
        "pinMeta": "",
        "replace": false,
        "origins": "",
        "failed": false,
        "location": "local",
        "dagSplit": false,
        "splitFrom": 0,
        "path": "/my/path/2/estuary"
      }
    ]
    + curl -X DELETE -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' http://localhost:3004/collections/5b73a91a-0dfc-4f6b-94d1-aad8c9881366/content/58 -s
    + echo 'deleted file1 (58) from estuary'
    deleted file1 (58) from estuary
    + curl -X GET -H 'Authorization: Bearer EST599c28b4-fd89-4b44-b579-74af0997a5ecARY' 'http://localhost:3004/collections/content?coluuid=5b73a91a-0dfc-4f6b-94d1-aad8c9881366' -s
    + jq
    [
      {
        "id": 59,
        "updatedAt": "2022-09-14T18:40:22.910605199Z",
        "cid": "bafybeiezx5rydlqhyqzlqnc7zz7eawqowyzfjoon6b43dbkvws5rop3ivy",
        "name": "estuary",
        "userId": 1,
        "description": "",
        "size": 193993620,
        "type": 0,
        "active": true,
        "offloaded": false,
        "replication": 6,
        "aggregatedIn": 0,
        "aggregate": false,
        "pinning": false,
        "pinMeta": "",
        "replace": false,
        "origins": "",
        "failed": false,
        "location": "local",
        "dagSplit": false,
        "splitFrom": 0,
        "path": "/my/path/2/estuary"
      }
    ]
    
  • 3

    boost integration

    This PR modifies Estuary so that it can make deals with Boost.

    Boost listens on v1.2.0 of the deal proposal protocol, so to make deals with Boost, Estuary needs to:

    • Connect to the Storage Provider
    • Check if the SP supports deal proposal protocol v1.2.0
    • Create an auth token, and send the shuttle the data for the deal so that it will accept a request carrying that auth token
    • Send a deal proposal to the SP running Boost, including the auth token and download address
    • Boost then downloads the data from the shuttle

    The new version of the deal proposal protocol also needs the size of the CAR file (i.e., the size of the data that will be transferred across the wire), so there are some changes to calculate the CAR file size of deal data before making a deal proposal.

    TODO:

    • [ ] The estuary primary node gets the announce address from the shuttle. For Boost we need the announce address to be public. If the announce address is public, does that mean that calls from the primary node to the shuttle will go through the internet (instead of the local network)?

    Depends on https://github.com/application-research/filclient/pull/60

  • 4

    Chore/extract api package

    This PR moves all API logic into an apiV1 package and lays the groundwork for the API v2 work (apiv2), with both packages living in the api directory. It is also a step toward improving the code for readability, testing, and identifying service boundaries (which will help us build a better, more deterministic platform).

    Verification

    • I have tested content/add, content/add-car, pinning/pin, staging-zones, deal-making, and shuttles - all work

    Please have a look and provide feedback.

  • 5

    `/add-car` doesn't support uploading without `Content-Length` set

    Describe the bug: Trying to upload files to estuary using linux2ipfs fails (where it previously didn't; this broke within a one-week window around the end of IPFS Thing).

    This is because linux2ipfs doesn't set the Content-Length header.

    This is undocumented behavior of the API, and the error message is absolutely terrible; the shuttle sends a 500 with this body:

    <html>
    <head><title>500 Internal Server Error</title></head>
    <body>
    <center><h1>500 Internal Server Error</h1></center>
    <hr><center>nginx/1.18.0 (Ubuntu)</center>
    </body>
    </html>
    

    To Reproduce: Just POST a CAR file to /content/add-car on a shuttle WITHOUT Content-Length. (Note: be careful, curl adds Content-Length automagically for you by default.)
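
    A minimal reproduction sketch (the shuttle hostname and API key are placeholders): asking curl for chunked transfer encoding makes it omit Content-Length.

    # chunked transfer encoding suppresses curl's automatic Content-Length header
    curl -X POST 'https://shuttle.example.com/content/add-car' -H 'Authorization: Bearer YOUR-API-KEY' -H 'Transfer-Encoding: chunked' --data-binary @some.car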

    Expected behavior: I prefer the old behaviour, where the file would just be streamed until EOF.

    Actual behavior: If Content-Length is missing, it fails extremely unclearly.

    Additional context: I understand that for some technical reason you might need Content-Length (like preallocating some space or whatever). However, I would like this to be documented in the HTTP API docs, as well as to have a meaningful error message (like Error: Content-Length required).


    I added a "work-around" in linux2ipfs where I just set the content length: https://github.com/Jorropo/linux2ipfs/commit/517015a5506b4c7b429ceb4b2cbd847406c1ea12

  • 6

    Ignore trailing slash on API endpoints

    Closes #438

    This change will strip the trailing slash from any incoming request, so the caller can make a request with or without it (e.g., http://localhost:3004/health or http://localhost:3004/health/); both will work now.
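
    A quick sanity check against a local node on the default port (assuming the change is deployed):

    # both requests should now return the same response
    curl -i http://localhost:3004/health
    curl -i http://localhost:3004/health/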

  • 7

    docs(readme): Add clarification of first user credentials

    The problem

    When providing credentials via the front-end, users' passwords are hashed with a predefined salt, defined here: https://github.com/application-research/estuary-www/blob/master/common/constants.ts#L28. The resulting hash is then safely sent to authenticate. This goes for both signing up and logging in.

    Now when creating the first user via:

    ./estuary setup --username=<uname> --password=<pword> --database=XXXXX

    The provided password is saved salted with a randomly generated UUID, not the salt provided above. Therefore, when trying to authenticate with these same credentials, the resulting hash will not match, and the first user is not able to authenticate. The user can work around this by using the token and logging in with it. However, it might be confusing for first-time users of estuary-www to find that the provided credentials do not work.

    The Fix

    It would be possible to salt the first user's password in the backend with the same salt, but I'm not sure whether maintaining a copy of the same salt (in both estuary-www and estuary) is desired, since at some point they might diverge.

    However, the documentation provided in this PR will at least avoid confusion.

  • 8

    Document content/add response structure

    I've added the structure to the docs.go file, but I'm not sure how the JSON and Swagger are supposed to get generated.

    Is this the correct approach?

    I might submit one for add-car in a bit too.

  • 9

    "too many open files" error when runniing estuary

    I'm trying to run estuary and repeatedly get the error "failed to resolve local interface addresses ... too many open files".

    I've tried increasing the ulimit from 1024 to 10000 (ulimit -Hn 10000 && ulimit -Sn 10000), but I get the same error on Ubuntu 20.04.
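
    One thing worth checking (a suggestion, not a confirmed fix): ulimit only applies to the shell it is run in, so the limit has to be raised in the same shell that launches estuary, or in the service unit when running under systemd (the drop-in path below assumes the service is named estuary):

    # raise the limit in the launching shell, then start estuary from that same shell
    ulimit -n 10000
    ./estuary --datadir=/home/argosopentech/estuary-data --logging

    # or, when running under the systemd service, raise it in a drop-in and restart
    sudo mkdir -p /etc/systemd/system/estuary.service.d
    printf '[Service]\nLimitNOFILE=10000\n' | sudo tee /etc/systemd/system/estuary.service.d/nofile.conf
    sudo systemctl daemon-reload && sudo systemctl restart estuary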

    [email protected]:~/estuary$ ./estuary --datadir=/home/argosopentech/estuary-data --logging
    Wallet address is:  f12ozido5i7idkqv7pogsgxpt7rswxkgl5ur5guki
    2022/03/19 14:46:13 failed to sufficiently increase receive buffer size (was: 208 kiB, wanted: 2048 kiB, got: 416 kiB). See https://github.com/lucas-clemente/quic-go/wiki/UDP-Receive-Buffer-Size for details.
    2022-03-19T14:46:13.942Z        INFO    dt-impl impl/impl.go:145        start data-transfer module
    /ip4/164.92.159.150/tcp/6744/p2p/12D3KooWJchfMcLrpVmLjX8wCy4SunjjncfVf1ipMotLiBnCN2o1
    /ip4/127.0.0.1/tcp/6744/p2p/12D3KooWJchfMcLrpVmLjX8wCy4SunjjncfVf1ipMotLiBnCN2o1
    2022-03-19T14:46:13.947Z        INFO    estuary estuary/replication.go:719      queueing all content for checking: 0
    
       ____    __
      / __/___/ /  ___
     / _// __/ _ \/ _ \
    /___/\__/_//_/\___/ v4.6.1
    High performance, minimalist Go web framework
    https://echo.labstack.com
    ____________________________________O/_______
                                        O\
    ⇨ http server started on [::]:3004
    2022-03-19T14:46:58.507Z        ERROR   basichost       basic/basic_host.go:327 failed to resolve local interface addresses     {"error": "route ip+net: netlinkrib: too many open files"}
    2022-03-19T14:47:03.507Z        ERROR   basichost       basic/basic_host.go:327 failed to resolve local interface addresses     {"error": "route ip+net: netlinkrib: too many open files"}
    2022-03-19T14:47:08.508Z        ERROR   basichost       basic/basic_host.go:327 failed to resolve local interface addresses     {"error": "route ip+net: netlinkrib: too many open files"}
    
  • 10

    verify data stored on local estuary setup

    I need help verifying data stored on my local estuary setup.

    Steps:

    • setup estuary
    • setup shuttle
    • setup estuary-web
    • Add file via estuary endpoint

    When verifying the CID, I get the message "This CID is not found. It might be pinned by a IPFS Node, you can use the dweb.link URL to check".

    The content is accessible at the IPFS gateway: https://bafkqactumvzxictumvzxicq.ipfs.dweb.link/

    I would like to verify that the estuary node is pinning the data and that Filecoin deals are being made.
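
    One way to check pin status from the node itself (a sketch: Estuary exposes the IPFS Pinning Service API, and the port and API key here are assumptions based on the local setup above):

    # list pins and their current status (queued / pinning / pinned / failed)
    curl -H 'Authorization: Bearer YOUR-API-KEY' http://localhost:3004/pinning/pins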

    Thanks

  • 11

    Clients need two endpoints

    Describe the bug: Currently, because the API uses api.estuary.tech while uploads use upload.estuary.tech, clients need to construct two separate endpoints in their code.

    See https://github.com/filecoin-project/bacalhau/blob/9201087604398cbc1a458717cfca6f50dc22c976/pkg/publisher/estuary/endpoints.go#L19-L20

    This is unnecessary overhead for our users. We should make the UX simpler by providing a single front door.

    @simonwo for awareness.

  • 12

    feat: sp endpoints v2 (Revision B)

    Closes #578

    This is revision B as the original PR was reverted, since it broke V1. (https://github.com/application-research/estuary/pull/831)

    In this new PR, V1 has been left completely untouched to avoid anything breaking.

  • 13

    Bump gorm.io/driver/sqlite from 1.1.5 to 1.4.4

    Bumps gorm.io/driver/sqlite from 1.1.5 to 1.4.4.

    Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


    Dependabot commands and options

    You can trigger Dependabot actions by commenting on this PR:

    • @dependabot rebase will rebase this PR
    • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
    • @dependabot merge will merge this PR after your CI passes on it
    • @dependabot squash and merge will squash and merge this PR after your CI passes on it
    • @dependabot cancel merge will cancel a previously requested merge and block automerging
    • @dependabot reopen will reopen this PR if it is closed
    • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
    • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
    • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • 14

    GET /collections/{col-uuid} Excruciatingly Slow

    Describe the bug: Hi there, when I run queries to get the contents of a specific path of a collection in the Estuary DB, I get highly varied and extremely slow response times, ranging from 1 to 15 seconds.

    To Reproduce

            const url = dir ? `https://api.estuary.tech/collections/${col_id}?dir=${dir}`
                            : `https://api.estuary.tech/collections/${col_id}`;
            const start = Date.now();
            const result = await fetch(url, {
                method: 'GET',
                headers: {
                    "Authorization": "Bearer " + process.env.ESTUARY_API_KEY,
                }
            });
            const data = await result.json();
            const end = Date.now();
            console.log(`GET ${url} took ${end - start}ms`);
    
    GET https://api.estuary.tech/collections/5bf508d7-34c0-4ef7-8853-6dda75be491c?dir=/ took 853ms
    GET https://api.estuary.tech/collections/5bf508d7-34c0-4ef7-8853-6dda75be491c?dir=/ took 1395ms
    GET https://api.estuary.tech/collections/5bf508d7-34c0-4ef7-8853-6dda75be491c?dir=/ took 8364ms
    GET https://api.estuary.tech/collections/5bf508d7-34c0-4ef7-8853-6dda75be491c?dir=/ took 3951ms
    GET https://api.estuary.tech/collections/5bf508d7-34c0-4ef7-8853-6dda75be491c?dir=/ took 13101ms
    

    Additional context

    I see that in handlers.go there is a TODO which seems to imply that an unoptimized query is being run to fetch the content of the collection:

    	// TODO: optimize this a good deal
    	var refs []util.ContentWithPath
    	if err := s.DB.Model(collections.CollectionRef{}).
    		Where("collection = ?", col.ID).
    		Joins("left join contents on contents.id = collection_refs.content").
    		Select("contents.*, collection_refs.path as path").
    		Scan(&refs).Error; err != nil {
    		return err
    	}
    

    I was wondering whether it is as simple as improving the query, or whether the underlying models need to be changed to make this lookup more efficient. I don't know enough about Estuary's underlying system to try to fix this, but if anyone has some insight, that would be amazing.
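
    For what it's worth, one low-risk experiment (an assumption based on the column names visible in the query above, a Postgres backing store, and a placeholder $ESTUARY_DB_DSN connection string) would be indexing the join key:

    # add an index on collection_refs.collection so the per-collection lookup avoids a full scan
    psql "$ESTUARY_DB_DSN" -c 'CREATE INDEX IF NOT EXISTS idx_collection_refs_collection ON collection_refs (collection);'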

    Thanks.