LIquid: The soul of a new graph database, Part 1
July 23, 2020
Editor’s note: In this two-part blog series, we introduce LIquid, a new graph database built by LinkedIn to support human real-time querying of the economic graph. It is a complete implementation of the relational model that supports fast, constant-time traversal of graph edges with a relational graph data model that is simple and self-describing, yet still manages to support the definition and indexing of complex n-ary relationships and property graphs. LIquid’s interface is a declarative query language based on Datalog. LIquid’s query processing engine achieves high performance by using dynamic query planning on wait-free shared-memory index structures.
Part 1 below will describe how graph data relates to applications, specifically the in-memory graph of structures commonly used as “models” in the sense of “Model-View-Controller” as understood by graphical applications. Since the reason for having a database is ultimately applications, this seems like a good place to start.
In part 2, we will describe how graph data can be stored in a conventional relational database, and why conventional relational databases don’t work very well for this purpose. Rather than scrapping the relational model completely, we will go on to show how relational graph data can be processed efficiently.
Introducing LIquid, LinkedIn’s in-house graph database
Our team at LinkedIn has spent the last four years building a new graph database named LIquid. LIquid is a single, automated service that replaces the hand-coded queries that sometimes added several hundred milliseconds to page load times for members. It can be queried with a powerful general-purpose query language and return the necessary results in an optimal fashion.
Why does LinkedIn need a graph database? We’ll get to a more formal answer later in this post, but here’s the intuition: the value of an economic graph for the average member lies mostly in their second degree network. These are the connections of your connections, such as the colleague of an old school friend, or the new boss of a co-worker from a prior job.
On LinkedIn, your first degree network is likely small, typically averaging a few hundred current or former coworkers, and other people that you already know. Your third degree network is likely very large, but it is hard to act on, say by leveraging these relationships to get a job, because doing so requires making two sequential new connections. However, the set of second degree connections is typically at least 50,000 entities (e.g.,people, schools, employers). It is large enough to be diverse, but at the same time, actionable: you are one new connection away from something that can happen. In fact, your next job is probably in your second degree network.
A first degree network of, for example, 250 connections, would be easily handled by a simple table or key-value store. Second degree networks are a much more challenging software engineering problem because the set of second degree connections is too large to pre-materialize and store, and the write amplification involved in keeping a pre-materialized second degree connection set up to date—roughly 250 times the base write rate, or once for each first degree connection—makes this approach impractical. On the other hand, computing second degree connections on demand is no picnic either. The first degree is easy to materialize, but thereafter, the join to produce the second is daunting, particularly if you have to search a table of billions of edges stored in conventional sorted storage (typically some sort of terabyte-scale B-tree). For an average first degree, this join will be extremely sparse, effectively 250 random look-ups.
For these reasons, LinkedIn has built a series of increasingly general-purpose graph serving systems to provide real-time access to the core of the economic graph. Initially, the graph only stored connections in a very simple format, a tuple of (source, dest, score). Subsequent versions were extended to include other types of edges such as employment and education. However, edges were limited to 2 integer endpoints and 64 bits of attribute data, (type, source, dest, attributes). LIquid is the most recent of these systems, and the first one which is general purpose: a database that implements the relational model, fully described by a schema and accessed by a declarative query language with functionality comparable to SQL. In LIquid, edges are triples of strings, (subject, predicate, object), and compounds can be built out of edges to represent n-ary relationships with arbitrary attributes.
From objects to graphs
Faced with the problem of working with second degree connections, most programmers would probably start out by putting everything in memory as objects and chasing pointers. For application programmers, this has been the predominant approach to modeling the world since Smalltalk popularized object-orientation and the model-view-controller approach to application development in the 1970s. While sequential access is always faster, random access in memory is fast enough that programmers are pretty blithe about adding an extra layer of indirection. The stunning complexity of modern UI suggests that second degree networks, which might take thousands of random accesses to produce, should be tractable to work with in human real time. Browsers, word processors, and spreadsheets all traverse tens or hundreds of thousands of pointers rapidly enough for people to consider them instantaneous.
Since our goal is to build applications, let’s start with a conventional application model, a graph of objects. If we were to build a graph of authors and books, it might look something like this:
The first question to ask is, “Why aren’t we done? Can’t we just keep a large graph of objects ready to use in main memory?
In fact, variants of this approach were tried by a variety of object-oriented databases, OODBs, in the 1990s, and while they succeeded in certain use cases, none of them worked very well in general. OODBs died off while relational systems prospered. Significant problems with OODBs were:
- Query language: There is no standard declarative OO query language, hence no ability for the database to optimize unconstrained by imperative logic. The premise of object-orientation was mixing code and data into “objects.” But if the programmer coded an inefficient access pattern in an imperative language, the OODB had no choice but to comply.
- Snapshot consistency: A graph of objects in memory is a snapshot of the state of the world at a single point in time. A consistent snapshot is an extremely expensive illusion to maintain—traditionally via multi-version concurrency control, utterly intractable as soon as the graph is larger than a single machine.
- Schema: While the OO premise of modeling objects directly is attractive, it comes with intractable schema development problems. Obvious uses of inheritance lead rather quickly to insurmountable difficulties. The following questions arise: How do we extend our example to cover other relationships between people and books such as editors? If an editor is a similar subclass of person, how do we represent people who are both authors and editors? What about non-human authors, groups or pseudonyms? Given the ease of creating flawed schema, the fact that OO languages have no answer for schema evolution just compounds the problem. The de-facto answer to the schema evolution problem is to create new classes, ship them in a new executable, and somehow manage to read old serialized data (application files). Sometimes this works.
- Relational + ORM?: In practice, there were few situations in which OODBs were substantially better than a relational database combined with an object-relational mapping layer that maintained the graph of objects as in-memory cache. For example, ObjectStore, which faulted in pages of native C++ objects, excelled in cases where an ORM cache would take days to load, circuit layout or industrial process display, but for run-of-the-mill database backed applications, the native C++ performance was trumped by the concerns listed above.
So, rather than dusting off some OODB, let’s explore how we might turn a graph of objects into graph data. The key insight is that the relations between books and authors already exist in the form of arguments to NewAuthorBook. NewAuthorBook updates the reciprocal fields, books_written and authors, which function as inverted indexes. The application program only uses the inverted indexes so the original author-book relation is simply discarded when NewAuthorBook is done.
If we’re just concerning ourselves with authors and books, a satisfactory solution is to make the relationship between people and books more general. Something like this will allow any sort of relationship between people and books:
While this strategy can be made to work in any specific case, it does not work when applied more generally. We’ll have the exact same problem again when we create movie_person. Furthermore, even the simple assumption that authors are people is problematic. There are pseudonyms, “Mark Twain,” and groups, “The Beatles,” that need a way to be represented as authors. The crux of the matter is that objects encode the type of relation in the class definition, specifically in the field offsets of the inverted indexes, offsetof(author, books_written) and offsetof(book, author). To express a durable model of the world, this rigid encoding simply doesn’t work. In practice, object-oriented programs work around this problem by creating a series of executables, each encoding some specific schema and storing the data elsewhere (relational tables, documents) using a more flexible schema.
However, adding the type of the relationship to the inverted index is a step in the right direction. Since the use of fixed offsets and structure extension is actually the cause of schema problems, let’s generalize the notion of “type of relationship” to “predicate” in a generic edge:
Notice that all objects are now structurally identical instances of identity, vectors of edges that refer to the object with one of subject, predicate, or object. There’s no schema evolution problem because there’s only one schema. The user can make up instances of identity that are predicates and start using them at any time. The cost of this flexibility is type-specific getter methods that compute their results by filtering edges for those with a matching predicate. This becomes a bit cumbersome if all we want to do is access a field. However, when you consider chasing pointers through a large graph of objects, every pointer dereference is an L3 cache miss, about 100ns. Scanning a few words in between dereferences doesn’t hurt very much, say an extra 20% (that would be about 40 L1 loads), tops.
The getters are very simple queries. We can use the same data structures to write more complex queries. Suppose that we have two identities and we want to know if they are related–if an edge between them exists. Here’s that query:
Imagine that a has a huge fan in, millions of edges, while b has just a few. It would be much better to scan b’s edges looking for a, than vice versa. Our uniform inverted index makes this query easy to implement, and the path to cost-driven evaluation of more complex queries, perhaps represented as little graphs of edge constraints, is easy to follow.
So, we’ve gained a flexible schema, and the ability to write queries. While writing application code in this idiom is much more cumbersome, from the standpoint of a database, this isn’t a problem. Application code consumes resources unpredictably and sometimes crashes, so we’re not going to be running application code in the database. Instead, we will be running queries (expressed in a declarative query language like SQL) in the database and shipping the results back to the application code that can use whatever classes it wants to represent the result graph.
As data, our graph has one more problem: there’s no way to refer to an instance of identity, a vertex, other than with a pointer to its location in memory. If the application program is going to run in a different address space, it needs a durable way to refer to identities. Durable identities need to come from the user, commonly as strings, things like “urn:author:1234”. The database can map these strings to pointers to instances of identity, wherever they may happen to be. Handily enough, strings can encode pretty much anything—URNs, names, numbers, dates—so an edge consisting of three strings is completely general. We can say (“urn:author:1234”, “wrote”, “urn:book:4567”), or (“urn:author:1234”, “name”, “Jack Kerouac”); the graph machinery doesn’t really care. What it cares about is the identity (a number) that corresponds to the string.
Thus far, we’re still relying on snapshot consistency. For a query such as FindPredicates to get consistent results, nobody else can change any of the objects that it reads. Simple strategies like locking objects one at a time as the query encounters them are hopelessly slow, deadlock-prone, and wrong. What we need are transactions that behave as if they obtained read or write locks on an entire set of objects in an instant. While transaction processing is extremely well-studied, the sad truth is that relative to a joyful romp through a snapshot in RAM, transactions perform poorly, distributed transactions, extremely poorly. If we’re going to compete with pointer-chasing, we need access to the indexes at pointer-speed, with basically no coordination whatsoever, not even a compare-and-swap.
If we restrict ourselves to a single writer, there is a small vocabulary of data structures that have wait-free implementations: logs, linked lists, hash tables. This is just barely enough for our purposes. To start, the writer will append edges to a single “graph log.” Each edge can then be referred to by its offset in the log and the log offset functions as a virtual time which allows us to define consistency as “answering a query ‘as of’ a certain log offset.” The index will contain copies of edges from the graph log. Each identity/object in the index is essentially a special-purpose mini-log containing just edges that refer to that object. In order to use the indexes to compute a consistent result, we decorate each edge with its offset from the graph log. Something like this (if the vectors were wait-free):
To generate a consistent result, we choose a log offset, probably the current end of the graph_log, and then scan the index log backwards discarding edges until the offset is less than our chosen log offset. Thereafter, any edges we find may be part of the answer to our query.
Thus, relative to a snapshot, we have to do a little more work. But the work is uniformly distributed–there’s no possibility of lock contention or deadlock—and it can take place at the maximum speed that the processor is capable of. In theory, the single writer is a bottleneck, but in practice, we have no trouble appending edges to the log at a rate which exceeds the write rate of a large social graph by many orders of magnitude. An ever-growing log would be a liability for edges which frequently change, but most human-curated data does not have this property. Additions occur at human speed, and deletions, which we handle by appending a tombstone edge, are relatively infrequent.
While heavily optimized for both space and speed, these index structures are the basic index data structures that LIquid uses. They deliver fast, constant-time performance that allows graph edges to be used blithely, much as one would use pointers. While we still have much to describe, the outline of LIquid is clear: we build a log-structured, in-memory inverted index of edges, triples of strings. Internally, the strings are converted to integers, pointers to structures in the object-oriented formulation. We process queries using this index and return the results to the application program as a set of edges, a subgraph. The application program converts the edges into whatever structural idiom is convenient, a graph of structures or a DOM tree, and displays the results to the user. The ingredients for this conversion are simple: mappings from identity string to address or structure offset and possibly a topological sort of result edges so that the target hierarchy can be built bottom-up.
In the next part, we’ll describe how to express a graph in the relational model, why conventional relational databases don’t work very well for this purpose, and how LIquid can process relational graph data efficiently.