A new beginning
21 Sep 2014
I've finally got around to resurrecting my blog. Here we go again!
I recently helped add versioning support to Neo4j.rb. Versioning only works for Rails models at the moment. To add versioning to your model, you'll need to include the Versioning module:
class VersionableModel < Neo4j::Rails::Model
  include Neo4j::Rails::Versioning
end
Versioning creates snapshots under a given Rails model instance. Note that snapshots aren't Rails models, but vanilla nodes. The relationship from the instance to each of its snapshots captures information about the snapshot, such as the version number, the class that is being versioned, and the id of the versioned model. When asked to retrieve a particular version of a model, the versioning module internally does a Lucene search using all of these parameters.
Snapshots currently capture all the properties of an instance. The snapshots create links to the same nodes that are linked to an instance, with one important difference. The snapshot's relationships all have a 'version_' prefix - the prefix is added in order to distinguish between 'regular' relationships and those created for versioning. Both incoming and outgoing relationships are recorded.
Let's look at an example:
class SportsCar < Neo4j::Rails::Model
  include Neo4j::Rails::Versioning
  property :brand
end

class Driver < Neo4j::Rails::Model
  include Neo4j::Rails::Versioning
  property :name
  has_n(:sports_cars)
end
With version 1 of the driver, there are no links to any cars:
driver = Driver.create(:name => 'Walter Plinge')
With version 2, we link the driver to a sports car - a Porsche.
driver.sports_cars << SportsCar.create(:brand => 'Porsche')
driver.save!
Now, both the driver and the current version are linked to the Porsche.
With version 3, we link the driver to a second car - a Ferrari. The graph now starts getting complicated.
The driver and the latest snapshot are both linked to two cars - the Porsche and the Ferrari.
To retrieve a snapshot, you'll need to call the version API.
instance.version(version_number)
The snapshot object will respond to the exact same properties as those on instance. It also allows relationships to be traversed using the incoming / outgoing API. The prefix stuff I'd mentioned above is handled transparently, so you can continue to use the same relationship names that you've used in your model.
driver.version(3).outgoing(:sports_cars) # Returns two cars, as expected
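To make the prefix handling a bit more concrete, here's a rough plain-Ruby sketch of the idea. It is a toy in-memory model, not the Neo4j.rb implementation: ToySnapshot, VERSION_PREFIX and stored_relationship_names are names I've made up for illustration, and the cars are plain strings rather than nodes.

```ruby
# Toy in-memory model of a snapshot; NOT the Neo4j.rb implementation.
# All names here (ToySnapshot, VERSION_PREFIX, ...) are hypothetical.
VERSION_PREFIX = 'version_'

class ToySnapshot
  def initialize(properties, outgoing_rels)
    @properties = properties.dup
    # Store each relationship under a prefixed name, e.g. "version_sports_cars".
    @outgoing = {}
    outgoing_rels.each do |name, nodes|
      @outgoing["#{VERSION_PREFIX}#{name}"] = nodes.dup
    end
  end

  # Property lookup, mirroring the properties of the original instance.
  def [](key)
    @properties[key]
  end

  # Callers pass the model's own relationship name; the prefix is added here.
  def outgoing(name)
    @outgoing.fetch("#{VERSION_PREFIX}#{name}", [])
  end

  # Raw stored names, exposed only to make the prefix visible.
  def stored_relationship_names
    @outgoing.keys
  end
end

snapshot = ToySnapshot.new({ :name => 'Walter Plinge' },
                           { :sports_cars => ['Porsche', 'Ferrari'] })
snapshot.outgoing(:sports_cars)     # the two cars
snapshot.stored_relationship_names  # ["version_sports_cars"]
```

The point is simply that the caller never types the prefix: the lookup adds it, so the relationship names from the model keep working on a snapshot.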
To revert to a particular version, simply call
instance.revert_to(version_number)
Revert restores properties and relationships, and creates a new version. So in the driver / car example, let's see what the driver looks like when we revert to version 2:
driver.revert_to(2)
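If it helps, the "revert creates a new version" behaviour can be sketched in plain Ruby. Again, this is a toy model under my own assumptions (every save records a snapshot, and cars are plain strings); ToyVersioned and its methods are hypothetical names, not the Neo4j.rb API.

```ruby
# Toy in-memory model of version history; NOT the Neo4j.rb implementation.
class ToyVersioned
  attr_reader :properties, :related

  def initialize(properties)
    @properties = properties.dup
    @related = []
    @snapshots = [] # index 0 holds version 1, and so on
    save!
  end

  # Assumption: every save records a snapshot of the current state.
  def save!
    @snapshots << { :properties => @properties.dup, :related => @related.dup }
    self
  end

  def current_version
    @snapshots.size
  end

  def version(number)
    raise ArgumentError, "no version #{number}" unless number.between?(1, @snapshots.size)
    @snapshots[number - 1]
  end

  # Restore the old state, then save, so the revert itself becomes a new version.
  def revert_to(number)
    old = version(number)
    @properties = old[:properties].dup
    @related = old[:related].dup
    save!
  end
end

driver = ToyVersioned.new(:name => 'Walter Plinge') # version 1
driver.related << 'Porsche'
driver.save!                                        # version 2
driver.related << 'Ferrari'
driver.save!                                        # version 3
driver.revert_to(2)                                 # version 4, same state as version 2
driver.current_version # 4
driver.related         # ["Porsche"]
```

Version 3 is still there after the revert, so nothing is lost: the history only ever grows.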
A little while ago, I'd helped add multitenancy support to Neo4j.rb. This allows the same Neo4j database instance to support multiple tenants/customers simultaneously. There are several ways of achieving multitenancy, but the main aim of the current solution was to minimize the number of required changes to an existing codebase.
So here's how it all works.
The Neo4j.rb multitenancy approach partitions the graph in such a way that all queries and traversals are scoped to a given tenant. This means that a query like Order.all, for example, would return different (and correct) results depending on what the current tenant is.
Once Neo4j supports sharding, it should be possible to adapt this scheme to take advantage of it.
The reference node is a starting point in the graph space. All Neo4j graphs are connected by default, which basically means that nodes and relationships cannot exist in isolation. The multitenancy feature works by 'moving' the reference node to whatever the current tenant is.
Neo4j.rb stores type metadata in the graph. Let's consider an example where we have two models: Tenant and Country.
In the screenshot below, the home icon represents the reference node, which is the default starting point in the graph. From the reference node, each model type has an outgoing relationship, named after the model class. Neo4j.rb refers to the node at the end of this relationship as a Rule node. The 'all' rule node is connected to every instance, and its count property keeps track of the number of instances.
This particular database has a single tenant instance...
... and three countries. The _classname property stores the Rails model class name for a given node.
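For readers who prefer code to screenshots, here's a small plain-Ruby sketch of that layout. It is a toy model, not how Neo4j.rb stores things internally: ToyRefNode and its methods are hypothetical, and the country names are made up for the example.

```ruby
# Toy in-memory model of a reference node with 'all' rule nodes.
# NOT Neo4j.rb internals; every name here is hypothetical.
class ToyRefNode
  def initialize
    # One 'all' rule per model class name, created on first use.
    @rules = Hash.new do |hash, class_name|
      hash[class_name] = { :count => 0, :instances => [] }
    end
  end

  # Attach a new instance under the rule for its class and bump the count.
  def add(class_name, properties)
    rule = @rules[class_name]
    rule[:instances] << properties.merge(:_classname => class_name)
    rule[:count] += 1
  end

  def all(class_name)
    @rules[class_name][:instances]
  end

  def count(class_name)
    @rules[class_name][:count]
  end
end

ref = ToyRefNode.new
ref.add('Tenant', :name => 'Tenant 1')
%w[India Sweden Brazil].each { |name| ref.add('Country', :name => name) }

ref.count('Tenant')  # 1
ref.count('Country') # 3
```

Everything hangs off the reference node, which is why swapping the reference node changes what a query like "all countries" can reach.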
While creating new tenants, it's often necessary to set up data for each tenant. Here's one way of doing this:
class Tenant < Neo4j::Rails::Model
  property :name
  ref_node { Neo4j.default_ref_node }
  after_create :create_default_data

  def create_default_data
    Neo4j.threadlocal_ref_node = self
    load("#{Rails.root}/db/tenant_default_data.rb")
  end
end
In case it takes a while to set up the data for a client, it's worth doing the data setup in the background.
Changing the reference node to a tenant effectively partitions the graph. All tenant-specific entities end up attached under the tenant, like so for Tenant 1:
... and for Tenant 2.
Neo4j.rb allows a reference node to be set for the current thread. A :before_filter method in your controller is a reasonable place to set the threadlocal reference node.
class OrdersController < ApplicationController
  before_filter :authenticate_user!, :ensure_tenant_setup!

  protected

  def ensure_tenant_setup!
    Neo4j.threadlocal_ref_node = current_user.tenant
  end
end
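To show why a per-thread reference node keeps tenants apart, here's a minimal plain-Ruby sketch. It only mimics the idea with Thread.current; the ToyGraph module, its methods and the :toy_ref_node key are hypothetical and are not Neo4j.rb's internals.

```ruby
# Toy model of a per-thread reference node; NOT Neo4j.rb internals.
module ToyGraph
  DEFAULT_REF_NODE = :default_ref_node

  # Remember the reference node for the current thread only.
  def self.threadlocal_ref_node=(node)
    Thread.current[:toy_ref_node] = node
  end

  # Fall back to the default when the current thread has not set one.
  def self.ref_node
    Thread.current[:toy_ref_node] || DEFAULT_REF_NODE
  end
end

# Each thread (think: each request) sees only the tenant it set.
results = [:tenant_1, :tenant_2].map do |tenant|
  Thread.new do
    ToyGraph.threadlocal_ref_node = tenant
    ToyGraph.ref_node
  end.value
end

results           # [:tenant_1, :tenant_2]
ToyGraph.ref_node # :default_ref_node, since the main thread never set one
```

One thing worth keeping in mind: servers that reuse threads across requests will also reuse whatever value was left behind, which is a good reason to set the reference node on every request rather than only once.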
By default, for every Rails model, Neo4j.rb creates Lucene indices using this format:
<TopLevelModulename>_<NextlevelModuleName>_<ModelClassName>-exact/fulltext
The multitenancy work adds the tenant name as an additional dimension to the index name. Each tenant node contributes an index prefix. The prefix makes it possible to partition the Lucene index on a per-tenant basis.
<Tenant Name>_<TopLevelModulename>_<NextlevelModuleName>_<ModelClassName>
The net effect is that Lucene queries are scoped to the current tenant. The exception is shared models, which are accessible across all tenants and use a single Lucene index.
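As a quick illustration of the naming scheme, here's a tiny plain-Ruby helper that builds such names. It is a hypothetical sketch, not a Neo4j.rb function: toy_index_name, the Shop::Billing::Order class name and the Tenant1 prefix are all made up, and I'm assuming the -exact / -fulltext suffix carries over to the tenant-prefixed form.

```ruby
# Hypothetical helper that mirrors the index naming scheme; NOT part of Neo4j.rb.
# Assumption: the "-exact" / "-fulltext" suffix is kept when a tenant prefix is added.
def toy_index_name(model_class_name, type, tenant_prefix = nil)
  # Module separators become underscores: "A::B::C" -> "A_B_C".
  base = model_class_name.gsub('::', '_')
  # Shared models pass no prefix and so map to a single index.
  base = "#{tenant_prefix}_#{base}" if tenant_prefix
  "#{base}-#{type}"
end

toy_index_name('Shop::Billing::Order', :exact)            # "Shop_Billing_Order-exact"
toy_index_name('Shop::Billing::Order', :exact, 'Tenant1') # "Tenant1_Shop_Billing_Order-exact"
```

Two tenants asking for the same model end up with two different index names, which is all the partitioning amounts to at the Lucene level.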
What about data that's shared across tenants? Examples could be a list of valid currencies, or a list of countries. It doesn't make sense to replicate this data on a per-tenant basis. To handle these kinds of entities, you can declare the model's reference node to be the default reference node, which ensures that all instances of the model are accessible across all tenants.
class Country < Neo4j::Rails::Model
  property :name
  ref_node { Neo4j.default_ref_node }
end