ZomboDB Documentation

Official guides and reference documentation for ZomboDB.

Index Options

As the Index Management documentation indicated, ZomboDB supports a set of custom advanced-use options for defining how an index relates to other indexes.

Consider that you have two tables, each with a ZomboDB index, and that your typical use case is to join the tables together in a view:

CREATE TABLE book (
   id bigserial not null primary key,
   title zdb.phrase,
   author varchar(64),
   publication_date date
);

CREATE TABLE book_content (
   book_id bigint not null primary key references book(id),
   content zdb.fulltext
);

CREATE INDEX idxbook ON book USING zombodb ((book.*));
CREATE INDEX idxcontent ON book_content USING zombodb ((book_content.*));

CREATE VIEW books_with_content AS 
   SELECT book.*, 
          book_content.content,
          book AS zdb
     FROM book
LEFT JOIN book_content ON book.id = book_content.book_id;

Suppose you want to do a full-text query against the books_with_content view. The query would be:

SELECT * FROM books_with_content WHERE zdb ==> 'author:foo and content:(beer w/3 wine w/30 cheese and food)';

Unfortunately, the above query will return zero rows because the index on book (which will be the chosen index due to the book AS zdb column in the VIEW) doesn't have a column named content -- that data lives in the book_content table.

We need to tell the index on book how to find corresponding book_content using an "index link". This is done through ZomboDB's index options:

ALTER INDEX idxbook SET (options='id=<public.book_content.idxcontent>book_id');

Now, when you run the above query, it'll be able to transparently search both indexes and "join" the matching data while searching.

The options string is a comma-separated list in the form of local_field=<schema.other_table.other_index>other_field.

A maximum of 1024 comma-separated index links can be set (in the options property), and the relationship types (one-to-one, one-to-many, many-to-many) don't matter.

This is a powerful feature because it allows you to keep your data as normalized as you want while still providing the ability to perform full text queries across all of it.

Index links can also be named, such that the fields behind the link appear to be part of another "object".

Taking the example from above:

ALTER INDEX idxbook SET (options='id=<public.book_content.idxcontent>book_id');

We could have, for example, named the index link book_content:

ALTER INDEX idxbook SET (options='book_content:(id=<public.book_content.idxcontent>book_id)');

And then the query would be:

SELECT * FROM books_with_content WHERE zdb ==> 'author:foo and book_content.content:(beer w/3 wine w/30 cheese and food)';

In general, this is a convenience feature for logically separating linked indexes by their domain, however it becomes more important when defining and searching shadow indexes.

Further Discussion

What you're doing in the options='...' string is telling ZomboDB how to get from one index to another. You're not technically describing "join conditions". You're describing how to lookup data in a different index and relate it to another index.

A more complex example might be:

options='content:(id=<public.book_content.idxcontent>book_id), 
        checkout_history:(id=<public.checkout_history.idxcheckout_history>book_id), 
        users:(checkout_history.user_id=<public.users.idxusers>id)'

So imagine two more tables named checkout_history and users with schemas as you might expect, both of which have ZomboDB indexes.

Behind the scenes, ZomboDB builds a graph of the relationships you define, and dynamically solves how to answer your query. It's even able to see through multiple-levels of indirection.

With the above definition, you'd be able to find all the books checked out by a particular user:

SELECT * 
  FROM book 
 WHERE book ==> 'author:shakespeare and users.full_name:"John Doe"'

The relationships need not be relative to the book table, as shown by the users:(checkout_history.user_id=<public.users.idxusers>id) link. ZomboDB understands that in order to get to the users index, it first has to go through checkout_history, and it knows that checkout_history links to book via the checkout_history:(id=<public.checkout_history.idxcheckout_history>book_id) link.

If you were to write the query as SQL, it would look like:

SELECT * 
 FROM book 
WHERE author ILIKE '%shakespeare%' 
  AND id IN (SELECT book_id 
               FROM checkout_history 
              WHERE user_id IN (SELECT id 
                                 FROM users 
                                WHERE full_name = 'John Doe'
                               )
             );

Rather than setting options='...' on the index, ZomboDB also provides the ability to specify them at runtime. To do that you want to make sure there's no options='...' property on the index, and then instead specify the link options using the zdb.link_options() function:

SELECT * 
  FROM book 
 WHERE book ==> zdb.index_options(
        ARRAY[
                'content:(id=<public.book_content.idxcontent>book_id)', 
                'checkout_history:(id=<public.checkout_history.idxcheckout_history>book_id)', 
                'users:(public.checkout_history.user_id=<users.idxusers>id)'
        ],
        'author:shakespeare and user.full_name:"John Doe"'
    );

Shadow Indexes

Shadow indexes are indexes that use an existing ZomboDB index, but let you specify different options. This is useful if an index is used in many different SQL-level views (or JOIN) situations where different linking options='...' are desired.

Shadow indexes do not consume additional disk resources, so they're "free" to create as needed.

In order to create a shadow index, you first need to make a custom UDF to use as the first column of the CREATE INDEX statement, and as the left-hand-side of the ==> operator. This is necessary so that Postgres will decide to use the shadow index instead of the real index.

You also need to define a new index (via CREATE INDEX) that specifies a WITH parameter named shadow instead of using the url parameter. The shadow argument is simply a boolean, whose value should be true.

Custom "shadow function" UDF

First, make a function that is defined exactly as below. You can change the name of the function:

CREATE OR REPLACE FUNCTION my_shadow_func(anyelement)
    RETURNS anyelement
    IMMUTABLE STRICT
    LANGUAGE c AS '$libdir/zombodb.so', 'shadow_wrapper';

Again, this function will be used for CREATE INDEX and queries.

The shadow Index

Next, create a shadow index:

CREATE INDEX idxshadow ON (book) 
       USING zombodb(my_shadow_func((book.*))) 
        WITH (shadow=true, options='<custom set of options>');

This index is set to use the existing index named idxbook and doesn't consume additional disk space or overhead when updating.\ Think of it as a "view" on top of another index.

Query

SELECT * 
  FROM book 
 WHERE my_shadow_func(book) ==> 'shakespeare';

Using the custom function you made above will allow Postgres to choose the shadow index that also uses that function (idxshadow) and then ZomboDB will apply the options='...' from the shadow index rather than the base idxbook index.

Usage with Views

If you want to use this in a view, it is required that you include my_shadow_func() in the output list and that it be aliased AS zdb. For example:

CREATE VIEW test AS 
   SELECT *, my_shadow_func(book) AS zdb FROM book;

Then you can query it as:

SELECT * FROM test WHERE zdb ==> 'shakespeare';

And of course, the view can be as complex as you need and can include whatever other tables you might want.

It's important to remember that ZomboDB is only going to return matching rows from the base table (the table specified as the argument to my_shadow_func()), so you'll need to structure your view accordingly.

Finally.
The missing piece.

Get Started Today

Connect with ZomboDB