graph is a generic library for creating graph data structures and performing operations on them.

  • By Dominik Braun
  • Last update: Dec 30, 2022
  • Comments: 16

dominikbraun/graph

graph is a generic library for creating graph data structures and performing operations on them. It supports different kinds of graphs such as directed graphs, acyclic graphs, or trees.

Features

  • Vertices of any data type, such as int or City.
  • Optionally combinable graph types and traits.
  • Validations considering the graph type, such as cycle detection in acyclic graphs.
  • Determination of graph and vertex properties, such as degree or tree-depth.
  • Non-recursive walks, DFS, and BFS.
  • Pathfinding algorithms, considering edge weights where appropriate:
    • Hamiltonian paths and cycles
    • Eulerian paths and cycles
    • Shortest path (Dijkstra)
    • Strongly connected components (Tarjan)

Status: Work in progress. Multigraphs aren't supported.

Getting started

go get github.com/dominikbraun/graph

Quick examples

Create a graph of integers

graph of integers

g := graph.New(graph.IntHash)

g.Vertex(1)
g.Vertex(2)
g.Vertex(3)
g.Vertex(4)
g.Vertex(5)

_ = g.Edge(1, 2)
_ = g.Edge(1, 4)
_ = g.Edge(2, 3)
_ = g.Edge(2, 4)
_ = g.Edge(2, 5)
_ = g.Edge(3, 5)

Create a directed acyclic graph of integers

directed acyclic graph

g := graph.New(graph.IntHash, graph.Directed(), graph.Acyclic())

g.Vertex(1)
g.Vertex(2)
g.Vertex(3)
g.Vertex(4)

_ = g.Edge(1, 2)
_ = g.Edge(1, 3)
_ = g.Edge(2, 3)
_ = g.Edge(2, 4)
_ = g.Edge(3, 4)

Create a graph of a custom type

To understand this example in detail, see the concept of hashes.

type City struct {
    Name string
}

cityHash := func(c City) string {
    return c.Name
}

g := graph.New(cityHash)

g.Vertex(london)

Create a weighted graph

weighted graph

g := graph.New(cityHash, graph.Weighted())

g.Vertex(london)
g.Vertex(munich)
g.Vertex(paris)
g.Vertex(madrid)

_ = g.WeightedEdge(london, munich, 3)
_ = g.WeightedEdge(london, paris, 2)
_ = g.WeightedEdge(london, madrid, 5)
_ = g.WeightedEdge(munich, madrid, 6)
_ = g.WeightedEdge(munich, paris, 2)
_ = g.WeightedEdge(paris, madrid, 4)

Perform a Depth-First Search

This example traverses and prints all vertices in the graph in DFS order.

depth-first search

g := graph.New(graph.IntHash, graph.Directed())

g.Vertex(1)
g.Vertex(2)
g.Vertex(3)
g.Vertex(4)

_ = g.Edge(1, 2)
_ = g.Edge(1, 3)
_ = g.Edge(3, 4)

_ = g.DFS(1, func(value int) bool {
    fmt.Println(value)
    return false
})
1 3 4 2

Find strongly connected components

strongly connected components

g := graph.New(graph.IntHash)

// Add vertices and edges ...

scc, _ := g.StronglyConnectedComponents()

fmt.Println(scc)
[[1 2 5] [3 4 8] [6 7]]

Find the shortest path

shortest path algorithm

g := graph.New(graph.StringHash, graph.Weighted())

// Add vertices and weighted edges ...

path, _ := g.ShortestPath("A", "B")

fmt.Println(path)
[A C E B]

Cycle checks for acyclic graphs

cycle checks

g := graph.New(graph.IntHash, graph.Acyclic())

g.Vertex(1)
g.Vertex(2)
g.Vertex(3)

_ = g.Edge(1, 2)
_ = g.Edge(1, 3)

if err := g.Edge(2, 3); err != nil {
    panic(err)
}
panic: an edge between 2 and 3 would introduce a cycle

Concepts

Hashes

A graph consists of nodes (or vertices) of type T, which are identified by a hash value of type K. The hash value is obtained using the hashing function passed to graph.New.

Primitive types

For primitive types such as string or int, you may use a predefined hashing function such as graph.IntHash – a function that takes an integer and uses it as a hash value at the same time:

g := graph.New(graph.IntHash)

This also means that only one vertex with a value like 5 can exist in the graph if graph.IntHash used.

Custom types

For storing custom data types, you need to provide your own hashing function. This example function takes a City and returns the city name as an unique hash value:

cityHash := func(c City) string {
    return c.Name
}

Creating a graph using this hashing function will yield a graph with vertices of type City identified by hash values of type string.

g := graph.New(cityHash)

Traits

The behavior of a graph, for example when adding or retrieving edges, depends on its traits. You can set the graph's traits using the functional options provided by this library:

g := graph.New(graph.IntHash, graph.Directed(), graph.Weighted())

Documentation

The full documentation is available at pkg.go.dev.

Download

graph.zip

Comments(16)

  • 1

    Proposal: Decouple vertice/edge store from graphs

    Hey there,

    Let me start off by saying a big thank you for releasing this as open source. I really like the use of hashes for identifying vertices as it allowes for more flexibility.


    I was attempting to split out the storage/retrieval of vertices/edges to a Store interface and eventually implement a sql store (mainly to figure out how horrible the performance would be). While doing so, I bumped into a couple of things I ended up changing that I wasn't really comfortable with and wanted to ask a couple of questions if that's ok.

    At this point can I just say that this is meant as critique, but rather as genuine questions from someone who hasn't spent much time dealing with graphs. I'm just trying to understand the thinking behind the design decisions and avoid making mistakes because I did not understand or think of something.

    This PR is a work in progress and is mainly to get a discussion started. In its current state it only includes an in-memory store as a POC, but tests should be all passing. (Memory store tests are needed.)

    I'd be more than happy to work with you on getting this right and merging it upstream, but I totally understand if this is not something you'd be interested in; in which case please feel free to close this PR at any point. :)


    So, couple of questions:

    1. What's the need for the un/directed.edges attribute? From what I can tell, in both directed and undirected it's the same as outEdges. I've kinda removed it as I wasn't really sure on what the usecase was.

    2. Currently the Edge[T] struct contains the source/target vertices. Besides a more developer-friendly API, is there a reason for the source/targets to not just be hashes? I played around with using Edge[K] instead, as it greatly simplifies the Store implementations.

    3. Do you have any plans currently on adding extra attributes to the Edge besides weight? ie. Labels maybe?

    4. The readme mentions that multigraphs are not supported. Is this something you are planning or would be interested in? It would pose a number of interesting questions around the API for both the graph and the Store.


    Thank you very much in advance! Regards, ~geoah

  • 2

    allow vertices to have weights and attributes

    This adds the possibility for vertices to have weights and attributes - just like edges. It solves #32 and is based on the idea from the issue. Graphs store the vertices' properties in a map based on the hash value.

  • 3

    New public API

    At version 0.8.0, there is a central Graph interface that offers two kinds of methods: Methods to build, modify, and "read" the graph, and methods to perform operations and algorithms on the graph. These two things should be separated. Algorithms shouldn't be the graph's concern.

    What I mean by "reading" is to retrieve vertices and edges.

    I'm thinking of a lighter Graph interface that only has methods for building and reading the graph itself, and standalone package functions for algorithms such as pathfinding, that accept a Graph instance and do the rest themselves.

    The obvious challenge is to find an interface surface for Graph that satisfies the needs of pretty much all algorithms, which was the reason why I didn't go for this approach in the first place. Now that there are many functions and methods in place, figuring out the needs for the algorithms should be a bit easier.

    Example for finding SCCs:

    Old:

    g := graph.New(graph.IntHash, graph.Directed())
    scc := g.StronglyConnectedComponents()
    

    New:

    g := graph.New(graph.IntHash, graph.Directed())
    scc := graph.StronglyConnectedComponents(g)
    

    Why?

    The advantage is that this scales much better. For 20 new algorithms, the Graph interface doesn't grow by 20 new methods.

    This is relevant in that there are many algorithms that roughly look the same for directed and undirected graphs, and yet they need to be implemented twice. A standalone package function would not only make the Graph interface simpler but also prevent code and test duplication.

  • 4

    Change `TopologicalSort` and `TransitiveReduction` to fail at runtime for acyclic graphs

    With release 0.12.0, TopologicalSort and TransitiveReduction check whether the graph actively prevents the creation of cycles with the PreventCycles option being set.

    In the future, those functions should not rely on the graph not having any cycles, but should just run and fail if a cycle is detected. The preceding isDag() check should be removed.

  • 5

    Escape node id in graphviz dot diagram

    Hello, this is a really cool library. Thanks so much for creating it!

    I found a small bug in the dotTemplate used to render graphviz dot diagrams; it doesn't quote node ids or escape special characters.

    Could we improve support by adding quotes to the dotTemplate and escaping any quotes in the node id?

    Here's an id i've got in my graph for reference: /home/folder:hello.world.

    If i manually quote the strings in the generated dot diagram it works.

    strict digraph {
    
    	"/home/folder:hello.world" -> "something/else" [  weight=0 ];
    
    }
    
  • 6

    Draw straight lines

    image

    func DrawGraph() {
        // new a graph    
        g := graph.New(graph.StringHash, graph.Directed())    
        
        // add vertexes    
        for vertex := range file2index {    
            g.AddVertex(vertex, graph.VertexWeight(5))    
        }    
        
        for i := 0; i < fileAmount; i++ {    
            forerunner := index2file[i]    
            for j := 0; j < fileAmount; j++ {    
                if directedGraph[i][j] {    
                    successor := index2file[j]    
                    // add egdes    
                    g.AddEdge(forerunner, successor)    
                    fmt.Println(forerunner, successor)    
                }    
            }    
        }    
        
        // save the graph    
        file, _ := os.Create("./mygraph.gv")    
        draw.DOT(g, file)    
        
        // print strongly connected components    
        scc, _ := graph.StronglyConnectedComponents(g)    
        fmt.Println(scc)    
        
        // print topological sort result    
        order, _ := graph.TopologicalSort(g)    
        fmt.Println(order)    
        
    }
    
    

    Hi bro, can you tell me how to make edges straight? Just like your examples of readme.md. Thank you very much!

  • 7

    (In-)Neighbors Directed Graph

    Hi,

    I am trying to find the in-neighbors for a node in a directed graph. To do so, I thought about using the AdjacencyMap since the documentation says:

    AdjacencyMap computes and returns an adjacency map containing all vertices in the graph. There is an entry for each vertex, and each of those entries is another map whose keys are the hash values of the adjacent vertices.

    However, it turns out, that in this example (omitting error checks for shortness):

    g.AddVertex(0)
    g.AddVertex(1)
    g.AddVertex(2)
    
    g.AddEdge(0, 1)
    g.AddEdge(0, 2)
    
    m, err := g.AdjacencyMap()
    log.Println("Testing key 0: ", m[0])
    log.Println("Testing key 1: ", m[1])
    log.Println("Testing key 2: ", m[2])
    

    we get:

    Testing key 0:  map[1:{0 1 {map[] 0}} 2:{0 2 {map[] 0}}]
    Testing key 1:  map[]
    Testing key 2:  map[]
    

    Empty maps for node 1 and node 2 while they do have node 0 as in-neighbor. This is not necessarily wrong if one says that vertices are only adjacent if there is an edge from the source vertex to this vertex. However, for directed graphs this is not well-defined without further information.

    Hence, I'd say at least this should be described better in the documentation. Even more, I want to propose something along a neighbors(vertex) function which could then also be configured with in- and out-neighbors in the case of directed graphs.

    Let me know how you think about this proposal and if there are meaningful starting points for this.

  • 8

    Update documentation to explicitly ignore errors returned by `AddVertex`

    At the moment, AddVertex calls in the documentation (README + GoDoc) look like this:

    g.AddVertex(1, 2)
    

    But AddVertex returns an error, and that should be made clear in the docs. The error should be ignored explicitly, just as with the AddEdge calls:

    _ = g.AddVertex(1, 2)
    

    Again, this should be done in both the README and the code comments.

  • 9

    Priority Queue implementation using heap

    Hi @dominikbraun, this PR is for issue #29. This implementation is mostly from this Priority Queue example from container/heap package.

    Let me know if something needed to be updated since I made some adjustments on collection.go and its tests. Thank you!

  • 10

    Implement priority queue using a binary heap

    The current priority queue implementation uses a simple slice to store its items.

    This approach is simple, but because all the entries in the queue need to be sorted by priority descendingly, inserting a new item has O(n) performance. Using a binary heap will optimize this towards O(log n) time complexity.

    • https://en.wikipedia.org/wiki/Priority_queue
    • https://bradfieldcs.com/algos/trees/priority-queues-with-binary-heaps/
  • 11

    New public API

    Closes #25.

    What this PR will do:

    • Remove the ByHash functions and make all of their sister functions accept hashes instead. Example: Degree(vertex T) and DegreeByHash(vertexHash K) is unified into Degree(vertexHash K).
    • Change all method names except getters to the form verb + subject. Example: Vertex becomes AddVertex.
    • Remove functions that don't modify but operate on the graph from the Graph interface and make them standalone functions that accept a Graph instead. Example Graph.DFS() becomes DFS(Graph).
    • Introduce the new Vertex and Predecessors methods.
    • Make all functions that could potentially fail return an error.
  • 12

    ShortestPath does not acually return the shortest path on a big graph

    This question is from advent of code Day 12 part 1. https://adventofcode.com/2022/day/12

    Problem

    The heightMap is a 2d array with shape [41 143], when I create a graph and input all the data, ShortestPath returns a path that looks to be semi random, ranging from 500 to 600.

    After this failed I implimented my own Dijkstra's Algorithm and got the accepted answer 468. Maybe you can look into Dijkstra's Algorithm?

    This method works fine with the small example on the question page, but no the much bigger input data.

    Code

    func findShortestPathToHighest(heightMap [][]int, start Coordinate, end Coordinate) int {
    	coordinateHash := func(c Coordinate) string {
    		return strconv.Itoa(c.X) + "," + strconv.Itoa(c.Y)
    	}
    	g := graph.New(coordinateHash, graph.Directed())
    	for y, line := range heightMap {
    		for x, _ := range line {
    			err := g.AddVertex(Coordinate{X: x, Y: y})
    			if err != nil {
    				panic(err)
    			}
    		}
    	}
    
    	for y, line := range heightMap {
    		for x, height := range line {
    			neighbors := getNeighbors(heightMap, x, y)
    			for _, neighbor := range neighbors {
    				if heightMap[neighbor.Y][neighbor.X] <= height+1 {
    					err := g.AddEdge(coordinateHash(Coordinate{X: x, Y: y}), coordinateHash(neighbor))
    					if err != nil {
    						panic(err)
    					}
    				}
    			}
    		}
    	}
    	shortest := 999999999
    	for i := 0; i < 1000; i++ {
    		path, err := graph.ShortestPath(g, coordinateHash(start), coordinateHash(end))
    		if err != nil {
    			panic(err)
    		}
    		if len(path)-1 < shortest {
    			println("new shortest:", len(path)-1)
    			shortest = len(path) - 1
    		}
    	}
    	return shortest
    }
    

    My input data

    https://gist.github.com/foxwhite25/06ee39218c9f3340056d2a14b81b3f50

  • 13

    Change `New` signature to accept `...func(*Properties)`

    This is a v1.0 feature. As long as graph is in version 0, it must not be implemented.

    To enable #66, the function signature of New has to look as follows:

    func New[K comparable, T any](hash Hash[K, T], options ...func(*Properties)) Graph[K, T]
    
  • 14

    Unify graph traits and properties

    This is a v1.0 feature. As long as graph is in version 0, it must not be implemented.

    At the moment, there are Traits and Properties. Traits contains graph traits such as IsDirected, and Properties contains dynamic graph properties such as Attributes. These two types shall be unified in a new Properties type.

    Motivation

    The reason is that New only accepts functional options for traits as a variadic parameter:

    func New[K comparable, T any](hash Hash[K, T], options ...func(*Traits)) Graph[K, T]
    

    Accepting a ...func(*Traits) implies that you can't pass functional options for Properties (func(*Properties)). An example for a functional option for Properties is Attribute(key, value string) func(*Properties) that adds a key-value pair to the properties, as it is the case for VertexProperties. Therefore, the user can't create a directed graph with a key-value pair directly:

    g := graph.New(graph.IntHash, graph.Directed(), graph.Attribute("name", "my-graph")) // Won't work!
    

    The motivation of this issue is to unify both in a Properties type, accept functional options for Properties as a variadic parameter of New, and allow New calls as shown above.

  • 15

    New `ShortestPaths` function for computing multiple shortest paths at once

    In situations with a fixed starting vertex and a number of different target vertices, calling ShortestPath for all of these vertices is quite expensive. Because ShortestPath already computes the cost of reaching a vertex from the start for all vertices in the graph, the algorithm can possibly easily be optimized for multiple targets.

    This should be implemented in a new function ShortestPaths[K comparable, T any](g Graph[K, T], source K, targets ...K) ([][]K, error) where [n][]K contains the vertex hashes of the shortest path for targets[n].

    The difference to the current implementation is that the backtracking needs to be performed for all target vertices.

    https://github.com/dominikbraun/graph/blob/f1f0bea0dbf1b32c14d961af033e3bf1234ae7fc/paths.go#L130-L142

  • 16

    Store adjacency map instead of two maps internally?

    The current Graph implementations store at least two maps containing the edges, and adjacency maps are computed on the fly using AdjacencyMap.

    This could possibly replaced with an implementation that just stores an adjacency map internally (instead of maintaining two maps), and returns that adjacency map directly when needed.