Text mining my paper with an online text visualization tool

Today, my course colleagues introduced me to an online text visualization tool. It can turn any text into an interactive network. Very cool!

I used it to analyze one of my conference papers, which focused on online social learning. Here is the link to the text network.

Screen Shot 2015-04-02 at 10.14.45 PM

I tried to embed it, but it doesn’t seem to show up 😦


Twitter hashtag analysis #justdoit

I used NodeXL (which only runs on Windows) and Gephi to analyze a Twitter search network for #justdoit. I am not a super fan of Nike, but my roomie is 🙂

In the dataset, each Twitter account that mentions the hashtag is a node. If a tweet is a reply to, or a mention of, another account, an edge is added between the two accounts. Here is what the Data Laboratory looks like in Gephi (the original data was generated automatically in NodeXL):

Screen Shot 2015-02-16 at 10.44.23 AM
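
To make the node and edge rules concrete, here is a minimal Python sketch of how such an edge list could be built. The tweet records and their field names (author, reply_to, mentions) are made up for illustration and are not NodeXL's actual columns. Storing a plain tweet as a self-loop is also just an assumption of this sketch.

```python
from collections import Counter

# Hypothetical tweet records; the field names are illustrative only.
tweets = [
    {"author": "alice", "reply_to": None, "mentions": []},
    {"author": "bob", "reply_to": "alice", "mentions": []},
    {"author": "carol", "reply_to": None, "mentions": ["alice", "bob"]},
]


def build_network(tweets):
    """Return (nodes, edges); edges maps (source, target, relationship) to a count."""
    nodes = set()
    edges = Counter()
    for tweet in tweets:
        author = tweet["author"]
        nodes.add(author)
        if tweet["reply_to"]:
            nodes.add(tweet["reply_to"])
            edges[(author, tweet["reply_to"], "reply")] += 1
        for mentioned in tweet["mentions"]:
            nodes.add(mentioned)
            edges[(author, mentioned, "mention")] += 1
        if not tweet["reply_to"] and not tweet["mentions"]:
            # Assumption: a plain tweet is kept as a self-loop on its author.
            edges[(author, author, "tweet")] += 1
    return nodes, edges


nodes, edges = build_network(tweets)
for (source, target, relationship), weight in sorted(edges.items()):
    print(source, target, relationship, weight)
```

The relationship label on each edge is what makes it possible to partition edges into tweet, reply, and mention later on.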



Here is the network that first shows up when I open the data file. It is not readable yet, so it needs some adjustments:

Screen Shot 2015-02-17 at 1.03.24 PM

Then, we need to run some calculations in the Statistics panel. (Details can be found on the Gephi wiki: http://wiki.gephi.org/index.php/Category:Measure)

  • Avg. weighted degree: the average, over all nodes, of the sum of the weights of each node’s edges. (For comparison, average degree simply counts each node’s edges and ignores the weights.) We always use avg. weighted degree, I think.
  • Modularity: measures how well a network decomposes into modular communities.
  • Eigenvector centrality: a measure of node importance in a network based on a node’s connections.
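
To see how two of these measures differ, here is a small sketch in plain Python. It is not Gephi's implementation: the toy edge list is made up, the network is assumed to be undirected, and the eigenvector centrality is a textbook power iteration that uses the edge weights.

```python
# Toy undirected network; the accounts and edge weights are made up.
edges = [("nike", "a", 3.0), ("nike", "b", 1.0), ("a", "b", 2.0), ("b", "c", 1.0)]


def degree_stats(edges):
    """Return (average degree, average weighted degree) for an undirected edge list."""
    degree, weighted = {}, {}
    for u, v, w in edges:
        for node in (u, v):
            degree[node] = degree.get(node, 0) + 1          # count the edge
            weighted[node] = weighted.get(node, 0.0) + w    # add its weight
    n = len(degree)
    return sum(degree.values()) / n, sum(weighted.values()) / n


def eigenvector_centrality(edges, iterations=100):
    """Power iteration: a node scores high when its neighbors score high."""
    neighbors = {}
    for u, v, w in edges:
        neighbors.setdefault(u, []).append((v, w))
        neighbors.setdefault(v, []).append((u, w))
    score = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        new = {
            node: sum(score[other] * w for other, w in links)
            for node, links in neighbors.items()
        }
        top = max(new.values())
        score = {node: value / top for node, value in new.items()}  # rescale to max 1
    return score


avg_degree, avg_weighted_degree = degree_stats(edges)
scores = eigenvector_centrality(edges)
print(avg_degree, avg_weighted_degree)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy network the account "c" has only one weak edge, so it ends up last in the centrality ranking even though it is connected to a well-connected node.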

Screen Shot 2015-02-17 at 1.13.31 PM


Then, we need to adjust the nodes’ min and max size and apply the modularity class.
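
Gephi finds the communities for us, so the sketch below only shows what the modularity score means for a partition that is already given. It uses the standard formula for an unweighted, undirected network (for each community, the share of edges inside it minus the share expected by chance). The toy network and the community labels are made up.

```python
# Two triangles joined by a single edge; labels 0 and 1 are the two communities.
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
membership = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}


def modularity(edges, membership):
    """Q = sum over communities of (internal edges / m) - (degree sum / 2m)^2."""
    m = len(edges)
    internal = {}    # edges with both ends in the same community
    degree_sum = {}  # total degree of the nodes in each community
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] = degree_sum.get(cu, 0) + 1
        degree_sum[cv] = degree_sum.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (d / (2 * m)) ** 2
        for c, d in degree_sum.items()
    )


q = modularity(edges, membership)
print(round(q, 3))
```

A score near 0 means the grouping is no better than random, while higher values mean most edges stay inside the communities.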

The next step is very important: the network contains many communities that don’t matter for us, so we only show the first three communities:

Screen Shot 2015-02-17 at 1.25.18 PM
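
The same filtering idea can be sketched in Python. This assumes that "the first three" means the three communities with the most nodes. The node names and modularity classes below are made up.

```python
from collections import Counter

# Hypothetical modularity classes: community 0 has 4 nodes, 1 has 3, 2 has 2, 3 has 1.
membership = {
    "u1": 0, "u2": 0, "u3": 0, "u4": 0,
    "u5": 1, "u6": 1, "u7": 1,
    "u8": 2, "u9": 2,
    "u10": 3,
}
edges = [("u1", "u2"), ("u5", "u6"), ("u8", "u9"), ("u9", "u10"), ("u4", "u10")]


def keep_largest_communities(membership, edges, k=3):
    """Keep only nodes in the k largest communities and the edges between them."""
    sizes = Counter(membership.values())
    keep = {community for community, _ in sizes.most_common(k)}
    nodes = {node for node, community in membership.items() if community in keep}
    kept_edges = [(u, v) for u, v in edges if u in nodes and v in nodes]
    return nodes, kept_edges


kept_nodes, kept_edges = keep_largest_communities(membership, edges)
print(len(kept_nodes), kept_edges)
```

Any edge that touches a removed node disappears too, which is why the filtered graph looks so much cleaner.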

Then partition the edges by relationship (tweet, reply, and mention), run a layout algorithm (Force Atlas) to get a better layout, show the labels for the nodes, and run the Label Adjust algorithm.

Here is my result for the first three communities using #justdoit:




(Important steps for creating a beautiful network: statistics, nodes, modularity, edges, labels, layout)

From the analysis, we can see that nikejapan, nike, and nikewomen_jp are the first three #justdoit communities. Why do Japanese people enjoy running this much? The first thing that comes to my mind is the book What I Talk About When I Talk About Running, written by the famous Japanese writer Haruki Murakami. It seems that his book has been a hit!

My Facebook Friends Network

First, I had to get Gephi installed on my MacBook. It’s very easy to install on Windows, but on Mac OS the process was full of pain. It took me almost 7 hours to troubleshoot all the problems. Finally, I won. I followed these two posts and tried every potential fix. If you run into the same issue, all you can do is either give up or keep cool and try all the tips in the two posts.
Then, I downloaded my personal friends network from Facebook with the NetGet application, which saves the friends network as a GML file. For this step, you’d better use Google Chrome to download the GML file. Safari did not work well for me: when I used it to save the file, it was automatically saved as a .txt file, which can’t be opened in Gephi. For groups or pages, we can get data from the Facebook app netvizz.
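
If the browser has already saved the file as .txt, one possible workaround is to check that the content really looks like GML and then rename it. This is only a sketch: the file name friends.gml.txt is made up, and the check is a rough one (it only looks for a "graph [" block).

```python
import re
import tempfile
from pathlib import Path


def looks_like_gml(text):
    """Rough check: a GML file contains a 'graph [' block."""
    return re.search(r"\bgraph\s*\[", text) is not None


def restore_gml_extension(path):
    """Rename 'name.gml.txt' or 'name.txt' to 'name.gml' if the content looks like GML."""
    path = Path(path)
    if path.suffix.lower() != ".txt":
        return path  # nothing to fix
    if not looks_like_gml(path.read_text(encoding="utf-8", errors="replace")):
        raise ValueError(f"{path.name} does not look like a GML file")
    if path.stem.lower().endswith(".gml"):
        target = path.with_name(path.stem)   # friends.gml.txt -> friends.gml
    else:
        target = path.with_suffix(".gml")    # friends.txt -> friends.gml
    path.rename(target)
    return target


# Small demo in a temporary folder, so no real files are touched.
with tempfile.TemporaryDirectory() as folder:
    saved = Path(folder) / "friends.gml.txt"
    saved.write_text('graph [\n  node [ id 1 label "me" ]\n]\n', encoding="utf-8")
    fixed = restore_gml_extension(saved)
    print(fixed.name)
```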
Now, I can start playing with Gephi!
1. Open the GML file in Gephi. The network is shown as a grey, meaningless graph.
Screen Shot 2015-02-14 at 2.32.46 PM
2. Layout: run Force Atlas or Force Atlas 2. You can try different parameters to adjust the layout.
Here is my choice:
Screen Shot 2015-03-10 at 11.48.31 AM
3. Go to Statistics and run Avg. Path Length (this also computes betweenness centrality), then go to Ranking/Nodes and apply betweenness centrality. Try different parameters for min/max size and color.
4. Go to Statistics and run Modularity, then go to Partition and apply Modularity Class.

network without labels

5. If you want to show names in the graph, click “show nodes labels” in the Graph window and use “label adjust” in Layout.

network with labels

6. Go to File/Export to export your graph.
Save your project.
Well done! 🙂
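
For the curious, here is roughly what the average path length in step 3 measures. This is a plain breadth-first-search sketch for an unweighted, undirected network, not Gephi's own code. The four friends in the toy network are made up, and unreachable pairs are simply skipped.

```python
from collections import deque

# Toy friends network as an adjacency list: a chain a - b - c - d.
friends = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c"],
}


def average_path_length(adjacency):
    """Average shortest-path length over all reachable ordered pairs of nodes."""
    total, pairs = 0, 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from this source
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in distance:
                    distance[neighbor] = distance[node] + 1
                    queue.append(neighbor)
        total += sum(distance.values())
        pairs += len(distance) - 1  # do not count the source itself
    return total / pairs if pairs else 0.0


print(round(average_path_length(friends), 2))
```

A small value means that most friends can reach each other through just a few intermediate people.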