Tag: graphx

  1. Performance Tuning Spark WikiPedia PageRank

    In my previous post I wrote some code to demonstrate how to go from the raw database extracts that Wikipedia provides monthly to loading them into Apache Spark GraphX and running PageRank. In this post I will discuss my efforts to make that process more efficient, which may be relevant…

    on tutorial spark scala graphx

  2. Computing WikiPedia's internal PageRank with Apache Spark

    Recently I have spent a lot of time reading and learning about graphs and graph analytics, which naturally drew me to Apache Spark GraphX, having previously played with Neo4j. The benefits of GraphX are that it is fully open source, scalable using the Apache Spark model, and written in Scala, which I have been…

    on tutorial spark scala graphx