12: what about cycles?
(NOTE: this section digs into some intricate details of how Specmonstah works. It might be confusing and it could use improvement.)
Let's say that in your forum all posts reference a parent topic, and that topics reference their first post. That means that for every topic, its :first-post-id should be set to a post's :id, and that post's :topic-id must reference the topic's :id. What we've got here is a good ol' fashioned cycle:
This graph shows how the ents are related: :p0, :p1, and :p2 all reference the topic :t0, and :t0 references :p0.
One problem that could arise in this situation is that your database might have a foreign key constraint on the post table requiring that a post's :topic-id be NOT NULL and that it reference a topic that actually exists. The topic's :first-post-id might allow NULL values, but have the constraint that the field reference a post that actually exists. However, when cycles exist in a graph, there's no way to know which of the two nodes in the cycle should come first. How do we make sure that Specmonstah does the following?
1. Inserts a :topic without a value for :first-post-id
2. Inserts a :post that references the :topic
3. Performs an update on the :topic to set its :first-post-id to the :post's :id
There are two parts to solving this problem. The first is to ensure that ents are sorted correctly for visiting functions. Since you can't topologically sort a cycle, we need some way to tell Specmonstah how to get from a graph with a cycle to a graph without one. We do that with the :required constraint. Have another look at the schema:
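A minimal sketch of a schema that matches this description. The :constraints and :conform entries come from the text of this section; the :prefix values and the exact :relations entries are assumptions chosen to fit the ent names (:t0, :p0) and attributes used here.

```clojure
;; Sketch only: :prefix and :relations values are assumed, not taken from
;; the original listing.
(def schema
  {:topic {:prefix    :t
           ;; a topic points at its first post
           :relations {:first-post-id [:post :id]}
           ;; options made available to the conform visiting function
           :conform   {:cycle-keys #{:first-post-id}}}
   :post  {:prefix      :p
           ;; every post points at its parent topic
           :relations   {:topic-id [:topic :id]}
           ;; the referenced topic must be visited before the post
           :constraints {:topic-id #{:required}}}})
```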
You can see that the :post definition includes :constraints {:topic-id #{:required}}. This is how you tell Specmonstah, "Make sure that the :topic that this :topic-id refers to gets visited before this :post". Internally, this instructs Specmonstah to create a temporary graph where the directed edges from topics to posts are removed, and then topologically sort that graph. You can see it at work:
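The idea can be reproduced without Specmonstah in a small, library-free sketch. This is not Specmonstah's implementation: the edges vector and the functions required-deps and topsort are hypothetical names used only for illustration. Non-required edges are dropped, and what is left is sorted so that every ent comes after the ents it requires.

```clojure
;; Hypothetical data: each edge is [ent referenced-ent required?].
(def edges
  [[:p0 :t0 true]    ; a post's :topic-id is :required
   [:p1 :t0 true]
   [:p2 :t0 true]
   [:t0 :p0 false]]) ; a topic's :first-post-id is not required

(defn required-deps
  "Map of ent -> set of ents it must come after, ignoring non-required edges."
  [edges]
  (reduce (fn [deps [ent target required?]]
            (cond-> (-> deps
                        (update ent #(or % #{}))
                        (update target #(or % #{})))
              required? (update ent conj target)))
          (sorted-map)
          edges))

(defn topsort
  "Repeatedly emits the ents whose remaining deps are empty.
  Returns nil if a cycle remains."
  [deps]
  (loop [deps deps, sorted []]
    (if (empty? deps)
      sorted
      (let [ready (for [[ent ds] deps :when (empty? ds)] ent)]
        (when (seq ready)
          (recur (reduce-kv (fn [m ent ds]
                              (assoc m ent (apply disj ds ready)))
                            (sorted-map)
                            (apply dissoc deps ready))
                 (into sorted ready)))))))

(topsort (required-deps edges))
;; => [:t0 :p0 :p1 :p2]
```

If every edge is treated as required, the cycle between :t0 and :p0 remains and this topsort returns nil, which is the situation the :required constraint is there to avoid.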
The second part to solving this problem is to break your visiting function up into multiple functions:
Here we're creating a visiting "function" named conform which is actually a vector of functions. The insert function is applied to all ents in topsort order, then the update-keys function is applied to ents in topsort order. You can see that :t0 is "inserted" without a :first-post-id, and then after all records have been inserted :t0 is "updated", setting its :first-post-id.
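The two-pass behaviour can be sketched in plain Clojure. Everything here is hypothetical and stands in for the real thing: db-log is an in-memory log rather than a database, records is hand-written sample data, and insert-record and update-cycle-keys play the roles of the insert and update-keys functions described above.

```clojure
;; Hypothetical in-memory "database" that records each operation in order.
(def db-log (atom []))

;; Keys that take part in a cycle and must be filled in by a later update.
(def cycle-keys #{:first-post-id})

;; Ents in topsort order, paired with their full attribute maps.
(def records
  [[:t0 {:id 1 :first-post-id 10}]
   [:p0 {:id 10 :topic-id 1}]
   [:p1 {:id 11 :topic-id 1}]])

(defn insert-record
  "First pass: insert every record, leaving out the cycle keys."
  [[ent record]]
  (swap! db-log conj [:insert ent (apply dissoc record cycle-keys)]))

(defn update-cycle-keys
  "Second pass: update only the records that have cycle keys to set."
  [[ent record]]
  (let [cycle-vals (select-keys record cycle-keys)]
    (when (seq cycle-vals)
      (swap! db-log conj [:update ent cycle-vals]))))

;; A visiting "function" that is really a vector of functions.
(def conform [insert-record update-cycle-keys])

(defn visit-all
  "Applies each function in turn to every record, in topsort order."
  [visit-fns records]
  (doseq [f visit-fns
          r records]
    (f r)))

(visit-all conform records)
@db-log
;; => [[:insert :t0 {:id 1}]
;;     [:insert :p0 {:id 10, :topic-id 1}]
;;     [:insert :p1 {:id 11, :topic-id 1}]
;;     [:update :t0 {:first-post-id 10}]]
```

Because the outer loop runs over the functions and the inner loop over the records, every insert happens before any update, so :p0 already exists by the time :t0 is pointed at it.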
This introduces one more question: How does the visiting function know to leave out :first-post-id when the insert function is applied, and add it when the update-keys function is applied?
The answer has multiple parts. First, let's look at the schema again:
Notice the :conform key in the :topic schema. The :conform key doesn't have any special meaning; we name it to match the visiting key for the conform visiting function. When the conform visiting function is applied, the value of the :conform key is passed in under the :schema-opts key. You can see that the insert function uses this:
The insert function creates the record to be inserted by dissocing any keys found in :cycle-keys of schema-opts.
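A minimal sketch of that step, assuming a hypothetical signature: db is an atom holding a vector of rows, the second argument is a map that carries :schema-opts, and record is the full attribute map. The real visiting function's arguments may differ.

```clojure
(defn insert
  "Sketch of an insert step. schema-opts is assumed to be the value of the
  ent type's :conform key, e.g. {:cycle-keys #{:first-post-id}}."
  [db {:keys [schema-opts]} record]
  (let [cycle-keys (:cycle-keys schema-opts)
        ;; with no cycle keys, (apply dissoc record nil) returns record as-is
        row        (apply dissoc record cycle-keys)]
    (swap! db conj row)
    row))

(def db (atom []))

;; A topic: :first-post-id is left out of the inserted row.
(insert db {:schema-opts {:cycle-keys #{:first-post-id}}} {:id 1 :first-post-id 10})
;; => {:id 1}

;; A post: its schema has no :conform key, so nothing is removed.
(insert db {} {:id 10 :topic-id 1})
;; => {:id 10, :topic-id 1}
```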