Neo4j.py -- Python bindings for the Neo4j Graph Database

A Python wrapper for Neo4j http://neo4j.org/

Website: http://components.neo4j.org/neo4j.py/

Neo4j.py can be used either with Jython, or with CPython through JPype or JCC. It is used in exactly the same way regardless of which backend is used.

The typical way to use Neo4j.py is:

import neo4j
graphdb = neo4j.GraphDatabase( "/neo/db/path" )
with graphdb.transaction:
    ref_node = graphdb.reference_node
    new_node = graphdb.node()
    # put operations that manipulate the node space here ...
graphdb.shutdown()
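
If the code inside the transaction raises an exception, the final graphdb.shutdown() call in the snippet above is never reached. The following is a minimal sketch of one way to guard against that. It relies only on graphdb.transaction and graphdb.shutdown() as shown above; the helper name run_and_shutdown and its work callback are hypothetical and not part of Neo4j.py.

```python
def run_and_shutdown(graphdb, work):
    """Run work(graphdb) inside a transaction, then always shut down.

    graphdb is an already opened neo4j.GraphDatabase and work is any
    callable that takes it. Both names are illustrative only.
    """
    try:
        with graphdb.transaction:
            return work(graphdb)
    finally:
        # Reached whether or not work() raised an exception.
        graphdb.shutdown()
```

It would be called as run_and_shutdown(neo4j.GraphDatabase("/neo/db/path"), work), where work contains the operations that manipulate the node space.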

Getting started

Requirements

To use Neo4j.py, the system needs to have a JVM installed, regardless of whether Jython or CPython is used.

The required Java classes are automatically downloaded and installed as part of the installation process.

With CPython

To use Neo4j.py with CPython the system needs to have JPype http://jpype.sourceforge.net/ installed.

To install Neo4j.py simply check out the source code:

svn export https://svn.neo4j.org/components/neo4j.py/trunk neo4j-python
cd neo4j-python

Then install using distutils:

sudo python setup.py install

This requires an internet connection, since the installation downloads the required Java libraries.

With Jython

Check out and install as with CPython:

svn export https://svn.neo4j.org/components/neo4j.py/trunk neo4j-python
cd neo4j-python
sudo jython setup.py install

Windows installation issues

Jython (in 2.5b3 or earlier) has a problem with installing packages under Windows. You might get this error when installing:

running install_egg_info
Creating X:\<PATH_TO>\jython\Lib\site-packages\
error: X:\<PATH_TO>\jython\Lib\site-packages\: couldn't make directories

If the install output ends like that when installing under Windows, don't panic.

All of Neo4j.py has already been installed at this point. This can be verified by checking that X:\<PATH_TO>\jython\Lib\site-packages\neo4j contains some directories, Python source files and bytecode compiled files. You can also verify that X:\<PATH_TO>\jython\Lib\site-packages\neo4j\classes contains the required jar files. What the install script has failed to do is to write the package information. This may cause trouble when installing a new version of Neo4j.py. The fix is to manually remove Neo4j.py before installing a new version.

This issue has been reported at https://trac.neo4j.org/ticket/156 and http://bugs.jython.org/issue1110. We have fixed this for the next release of Jython.
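
The manual check described above can also be scripted. This is a minimal sketch that assumes the directory layout given above (a neo4j\classes directory below site-packages); the function name installed_jars is hypothetical and not part of Neo4j.py.

```python
import glob
import os


def installed_jars(site_packages):
    """List the jar files found in <site_packages>/neo4j/classes."""
    classes_dir = os.path.join(site_packages, "neo4j", "classes")
    # An empty list means that the directory is missing or holds no jars.
    return sorted(glob.glob(os.path.join(classes_dir, "*.jar")))
```

Pass it the path to Jython's Lib\site-packages directory. An empty result suggests that the installation did not get as far as copying the jar files.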

JPype installation issues

In some situations the JPype compilation process might not link with the appropriate JNI headers, resulting in compilation errors.

The first thing to note is that JPype needs the JNI headers from a JDK in order to build. Having only a JRE installed is not enough when building JPype.

If the JAVA_HOME environment variable is not set when building JPype, the build script (setup.py) of JPype might have problems locating the appropriate JNI headers.

If you are building JPype with sudo python setup.py install, the JAVA_HOME environment variable might not be inherited into the sudo environment. An easy workaround is to run python setup.py bdist first, so that the compilation happens in your own environment, and then run the install.
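
Before building JPype it can help to confirm that JAVA_HOME really points to a JDK with JNI headers. The sketch below assumes that the headers live in an include directory directly below JAVA_HOME, which is the usual JDK layout but may differ on some platforms. The function name find_jni_header is hypothetical.

```python
import os


def find_jni_header(environ=None):
    """Return the path to jni.h below JAVA_HOME, or None if not found."""
    environ = os.environ if environ is None else environ
    java_home = environ.get("JAVA_HOME")
    if not java_home:
        return None  # JAVA_HOME is unset or empty
    # Assumption: the JDK keeps its JNI headers in <JAVA_HOME>/include.
    header = os.path.join(java_home, "include", "jni.h")
    return header if os.path.isfile(header) else None
```

If find_jni_header() returns None in the environment where the build runs, JAVA_HOME is either unset there or points to a JRE rather than a JDK.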

For more information see the following resources:

Starting the Neo4j Graph Database

In addition to the path where the Neo4j graph data is stored, GraphDatabase accepts a few extra keyword options. These include:

classpath
A list of paths that are to be added to the classpath in order to be able to find the Java classes for Neo4j. This defaults to the jar files that were installed with this package.
ext_dirs
A list of paths to directories that contain jar files that in turn contain the Java classes for Neo4j. This defaults to the jar file directory that was installed with this package. The classpath option is used before ext_dirs.
jvm
The path to the JVM to use. This is ignored when using Jython, since Jython is already running inside a JVM. Neo4j.py is usually able to compute this path.

Note that if the Neo4j Java classes are available on your system classpath, the classpath and ext_dirs options will be ignored.

Example:

graphdb = neo4j.GraphDatabase("/neo/db/path",
                              classpath=["/a/newer/kernel.jar"],
                              jvm="/usr/lib/jvm.so")

Package content

Some of the content of this package is loaded lazily. When the package is first imported, the guaranteed content is GraphDatabase and the API required for defining Traversals. The Exceptions might not be available yet. When the first GraphDatabase has been initialized, the rest of the package is loaded.

The content of this module is:

GraphDatabase
Factory for creating a Neo4j Graph Database.
Traversal
Base for defining traversals over the node space.
NotFoundError
Exception that is raised when a node, relationship or property could not be found.
NotInTransactionError
Exception that is raised when the node space is manipulated outside of a transaction.

The rest of the content is used for defining Traversals:

Incoming
Defines a relationship type traversable in the incoming direction.
Outgoing
Defines a relationship type traversable in the outgoing direction.
Undirected
Defines a relationship type traversable in any direction.
BREADTH_FIRST
Defines a traversal in breadth first order.
DEPTH_FIRST
Defines a traversal in depth first order.
RETURN_ALL_NODES
Defines a traversal to return all nodes.
RETURN_ALL_BUT_START_NODE
Defines a traversal to return all nodes except the start node.
StopAtDepth(x)
Defines a traversal to only traverse to depth=x.
STOP_AT_END_OF_GRAPH
Defines a traversal to traverse the entire subgraph.

Nodes, Relationships and Properties

Creating a node:

n = graphdb.node()

Specify properties for new node:

n = graphdb.node(color="Red", width=16, height=32)

Accessing node by id:

n14 = graphdb.node[14]

Accessing properties (e is a node or a relationship):

value = e['key'] # get property value
e['key'] = value # set property value
del e['key']     # remove property value

Create relationship:

n1.Knows(n2)

Any name that is not already an attribute of the node class can be used as a relationship type:

n1.some_relationship_type(n2)
n1.CASE_MATTERS(n2)

Specify properties for new relationships:

n1.Knows(n2, since=123456789,
             introduced_at="Christmas party")
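
Since a relationship is created by calling an attribute of the start node, the relationship type can also be chosen at run time with Python's built-in getattr. This is a minimal sketch of that idea; the helper name relate is hypothetical and not part of Neo4j.py.

```python
def relate(start, rel_type, end, **properties):
    """Create a relationship of type rel_type from start to end.

    This is the same as writing start.<rel_type>(end, **properties)
    with the type name spelled out in the source code.
    """
    return getattr(start, rel_type)(end, **properties)
```

For example, relate(n1, "Knows", n2, since=123456789) does the same as n1.Knows(n2, since=123456789). As with the direct form, case matters in the type name.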

Indexes

Get index:

index = graphdb.index("index name")

Create index:

index = graphdb.index("some index", create=True)

If an index that already exists is created, the existing index is not replaced. Instead, the existing index is returned. The create flag is a measure to help find spelling errors in index names.

Using indexes:

index['value'] = node
node = index['value']
del index['value']

When updating the index with a new value (for example when a property value on a node changes), remember to remove the old value from the index as well. Otherwise both values will be indexed.
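
The remove-then-add step can be wrapped in a small helper so that the old value is never forgotten. This is a minimal sketch using only the index operations shown above; the helper name reindex is hypothetical and not part of Neo4j.py.

```python
def reindex(index, old_value, new_value, node):
    """Move node from old_value to new_value in an index."""
    if old_value is not None:
        del index[old_value]  # drop the stale entry first
    index[new_value] = node   # then index the node under the new value
```

Pass None as old_value when the node was not indexed before.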

Using indexes as multi value indexes:

multiIndex.add('value', node)
for node in multiIndex.nodes('value'):
    doStuffWith(node)

Traversals

Traversals are defined by creating a class that extends neo4j.Traversal, and possibly previously defined traversals as well. (Note that neo4j.Traversal always needs to be a direct parent of a traversal class.) A traversal class needs to define the following members:

types
A list of relationship types to be traversed in the traversal. These are created using Incoming, Outgoing and Undirected.
order
The order in which the nodes of the graph are to be traversed. Valid values are BREADTH_FIRST and DEPTH_FIRST.
stop
Definition of when the traversal should stop. Valid values are STOP_AT_DEPTH_ONE and STOP_AT_END_OF_GRAPH. Alternatively the traversal class may define a more advanced stop predicate in the form of a method called 'isStopNode'.
returnable
Definition of which nodes the traversal should yield. Valid values are RETURN_ALL_NODES and RETURN_ALL_BUT_START_NODE. Alternatively the traversal class may define a more advanced returnable predicate in the form of a method called 'isReturnable'.

To define more advanced stop and returnable predicates the traversal class can define the methods 'isStopNode' and 'isReturnable' respectively. These methods should accept one argument (in addition to self), a traversal position. The position is essentially a node, but with the following extra properties:

last_relationship
The relationship that was traversed to reach this node. This is None for the start node.
is_start
True if this is the start node, False otherwise.
previous_node
The node from which this node was reached. This is None for the start node.
depth
The depth at which this node was found in the traversal. This is 0 for the start node.
returned_count
The number of returned nodes so far.

Nodes yielded by a traversal have an additional 'depth' attribute with the same semantics as above.
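
To make the position properties concrete, the sketch below writes the two predicates against a stand-in position object so that they can be tried without a database. The Position tuple, the class name LimitedPredicates and the two limits are all hypothetical. In a real traversal the methods would be defined on a class that extends neo4j.Traversal, and the position would be supplied by Neo4j.py. The sketch also assumes that a true result from isStopNode means the traversal does not continue past that node.

```python
from collections import namedtuple

# Stand-in for the traversal position described above (illustration only).
Position = namedtuple(
    "Position",
    "last_relationship is_start previous_node depth returned_count")


class LimitedPredicates(object):
    """Predicates that could be defined on a neo4j.Traversal subclass."""

    max_depth = 3     # hypothetical limits chosen for this example
    max_results = 10

    def isStopNode(self, position):
        # Stop at max_depth, or once enough nodes have been returned.
        return (position.depth >= self.max_depth
                or position.returned_count >= self.max_results)

    def isReturnable(self, position):
        # Yield every node except the start node.
        return not position.is_start
```

With these limits, a position at depth 3 is a stop node, and the start node (depth 0, is_start True) is neither a stop node nor returnable.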

Example Traversal declaration

class Hackers(neo4j.Traversal):
    types = [
        neo4j.Outgoing.knows,
        neo4j.Outgoing.coded_by,
        ]
    order = neo4j.DEPTH_FIRST
    stop = neo4j.STOP_AT_END_OF_GRAPH

    def isReturnable(self, position):
        return (not position.is_start
                and position.last_relationship.type == 'coded_by')

# Usage:
for hacker_node in Hackers(traversal_start_node):
    pass  # do stuff with hacker_node

Further information

For more information about Neo4j, please visit http://neo4j.org/

Please direct questions and discussions about Neo4j.py to the Neo4j mailing list: https://lists.neo4j.org/mailman/listinfo/user

Copyright (c) 2008-2010 "Neo Technology," Network Engine for Objects in Lund AB http://neotechnology.com