Notes on High Performance Collaborative SGD (actually it is Gradient Descent) in GraphMat and Ligra

Here is my notes on understanding, running and testing high performance collaborative filtering in GraphMat, a state of the art graph processing framework from Intel.

I implemented an equivalent algorithm in Ligra, another high performance graph processing framework.

This post is divided into two parts,

  1. the first part examines what Graphmat’s implementation does
  2. the second part demonstrates how you write an equivalent version in Ligra.

Many of the notes and instructions here are based on communications with Narayan, who is the first author on GraphMat framework.


GraphMat at VLDB 2015

Ligra at PPoPP 2012

GraphMat’s SGD code,

(1) High Level Points to Note

The code is names SGD, but it is actually gradient descent, because each node is accumulating all the errors.

This Gradient Descent does double buffering in that it updates the latent vectors after everyone has accumulated the error message.

On a high level, what it does is that it uses a buffer for latent_vector and another buffer for error. And it clears the error buffer at each iteration, and add the delta to the latent_vector buffer at the end of each iteration.

(2) GraphMat API Fundamentals and comparison to Ligra’s EdgeMap and VertexMap APIs.

There are a few functions that are important in understanding how GraphMat works.

send_message, imagine this as the data that you send from src of each edge to the destination. Normally, this would be a function of the vertex data. For example, in PageRank, this message would be the rank of the src node/output degree, in GD collaborative filtering, this is the latent_vector.


This receives the message from the source at the destination side and do some additional processing.

For PageRank, it doesn’t need to do additional processing, it will take the raw contribution from the source  (rank of source /out degree of source).

For GD (SGD but is actually gradient descent), it calculates the error for the edge, and multiply the source’s latent vector by the error.


This reduces all the messages from the neighbors of the current node. For both PageRank and

Send_message, process_message and reduce_function can be implemented as a user function passed into EdgeMap in Ligra as a single call to EdgeMap.


A post processing step that works on each vertex. It is an equivalent of VertexMap in Ligra.

Overall GraphMat is more restrictive than Ligra in terms of programming interface. Everything that GraphMat can do, Ligra’s interface will also be able to express it.

(3) Caveats about GraphMat.

Another trick to GraphMat, it differentiates between the order of the programs.

this->order = ALL_EDGES will process both IN_EDGES and OUT_EDGES. this is a tricky part to it. The IN_EDGES would work as EdgeMapDense. OUT_EGES would be EdgeMapDenseForward.

GraphMat doesn’t do the switch bewteen Push and Pull. So I would expect it to be much slower when doing BFS or other non-all-active graph programs (BC, for example).

(4) Specifically, how to run Collaborative Filtering in GraphMat

This is a simple ratings text file with 4 users and 3 items (7 vertices, 7 edges; items numbered after users)-

7 7 7

1 5 1

1 7 2

2 5 2

2 7 4

3 6 2

3 7 3

4 7 3

Copy this into ratings.mtx and convert to graphmat format by:

bin/graph_converter ratings.mtx ratings.bin.mtx

run SGD

bin/SGD ratings.bin.mtx

(5) How does GraphMat’s collaborative filtering work with a bit more detail

In ProcessingMessage, it gathers the latent_vector of the source node, and calculates the error using the dot product of the latent_vectors of source and destination. With the error, it updates the message gathered from the source, now the message represents the error from the specific edge, not the source’s latent vector.

See details in process_message.

Each source node, then accumulates all the errors from sources by summing them up (see reduce_functon).

This phase is an edge centric phase. The next phase uses the error message and lambda to update the latent vector. This second phase is a vertex centric phase.

It uses the formula here to collocate the updated latent_vector[i] += step*(-lambda*[i][i]);

(6) How did I implement it in Ligra

Following the mapping described before, with an edgeMap (edge centric phase) and vertexMap (vertex centric phase)

(7) How did I verify correctness between Ligra and GraphMat’s version

I summed the up the error^2 for all edges.

This entry was posted in High Performance Computing and tagged . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s