Buz
Event Collection, Validation, and Delivery.
Buz is a system for collecting events from various sources, validating data quality, and delivering them to where they need to be.
Quickstart
Quickstart documentation for setting up an end-to-end streaming analytics stack with Buz, Redpanda, Materialize, and Kowl can be found here.
Documentation
Documentation can be found here.
Short-circuit and return 400 when snowplow requests are invalid
Right now, sending unexpected payloads can sometimes cause 500 responses with panics.
We should consider building in some nice mechanisms for graceful failures and helpful responses, where applicable.
I'm not sure what about this payload in particular caused a panic.
Snowplow example not working.
Perhaps I misunderstood this section: https://github.com/silverton-io/honeypot/blob/main/website/docs/examples/quickstart.md#4-send-events-to-honeypot
but it indicates that the page it serves up at 8080 has Snowplow integrated for easy testing. I don't think that's the case... I just get
404 page not found
and I also get 404s when trying to POST or GET anything from localhost or where I have this served up in GCP right now. Not sure if I'm doing something wrong...
Add configurable sink-level retries
Right now sinks do not retry if they fail. Having that level of guarantee is important.
Sinks should be independently configurable to retry or not, and it would be cool to configure the retry strategy. Exponential backoff with min/max delays? Something else?
Make application secrets more flexible
Currently there's one way of providing secrets to honeypot: a file titled config.yml in the same directory as the binary. This is obviously non-ideal.
Options (probably all of them are ideal)
LEVEL1_LEVEL2_VARNAME
Add tf outputs, add system/env flexibility
This PR:
terraform directory under deploy
system and env vars so multiple environments can be rapidly provisioned using the same code
main to locals
391: GCP Terraform
Issue 391
Terraform for the GCP deploy, following the steps outlined in the GCP documentation. This is my first ever Terraform for GCP, so it may not follow all best practices. The main hurdle is that it has to be applied in two steps because uploading the image is a manual process. Otherwise it seems to work.
Make sink config simpler
It would be cool if this thing just split events into tables without being configured to do so, but could be configured in reverse to say "no, don't do that".
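To make the idea concrete, here is a small sketch in which the table name is derived from the event's schema by default, with an opt-out flag. SinkConfig, DisableTableSplit, DefaultTable, tableFor, and the example schema string are all hypothetical, and the naming rule is just one possible choice.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// nonIdent matches runs of characters that are not lowercase letters or digits.
var nonIdent = regexp.MustCompile(`[^a-z0-9]+`)

// SinkConfig is a hypothetical sink config where splitting is on by default.
type SinkConfig struct {
	DisableTableSplit bool   // the "no, don't do that" switch
	DefaultTable      string // used when splitting is off or the schema is unknown
}

// tableFor picks the destination table for an event with the given schema name.
func tableFor(c SinkConfig, schema string) string {
	if c.DisableTableSplit || schema == "" {
		return c.DefaultTable
	}
	// Lowercase the schema and collapse anything non-alphanumeric to "_".
	name := nonIdent.ReplaceAllString(strings.ToLower(schema), "_")
	return strings.Trim(name, "_")
}

func main() {
	split := SinkConfig{DefaultTable: "events"}
	single := SinkConfig{DisableTableSplit: true, DefaultTable: "events"}
	fmt.Println(tableFor(split, "com.example/page_view/1.0"))
	fmt.Println(tableFor(single, "com.example/page_view/1.0"))
}
```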
Create swagger documentation for endpoints
buz endpoints like /stats, /schemas, etc. are kind of 'need to know' right now, but ideally we'd have auto-generated documentation for them, with descriptions of what they do and what we can expect back when we hit them.