If you would like to visualize a text as a network graph, please, use our new open source InfraNodus text network visualization tool.

Textexture is outdated and is not supported any longer. Our new text visualization tool InfraNodus supports English, Russian, German, French and has advanced import, export, sharing, and filtering features: www.infranodus.com
 

 
 
When we read a text, we normally follow it in quite a linear fashion: from left to right, from top to bottom. Even when we skim articles quickly online, the trajectory is still the same. However, this is not the most efficient method of reading: in the age of hypertext we tend to create our own narratives using the bits and pieces from different sources. This is an easy task with short Tweets or Facebook posts, but it becomes much more difficult when we're dealing with newspaper articles, books, scientific papers. The amount of information we're exposed to increases from day to day, so there's a challenge of finding the new tools, which would enable us to deal with this overload.


As a response to this challenge we at Nodus Labs developed a new free online software toolTextexture.Com, which visualizes any text as a network and enables the user to use this interactive visualization to read through the text in a non-linear fashion. Using the network one can see the most relevant topics inside the text organized as distinctively colored clusters of nodes, their relationship to one another, and the most influential words inside the text, responsible for topic shifts. This way the user can navigate right into the topic of the text that is the most relevant to them and use the bigger (more influential) nodes to shift into another subject.Project:Textexture.Com)
Objective:Provide an easy-to-use online tool for text network visualization, fast non-linear reading, and comparative analysis of textual data.
Additional Info:Created by Nodus Labs, using the open-source freeGephisoftware andSigma.Jstoolkit.


Themethodologybehind Textexture is quite simple. First, a submitted text is scanned to remove all the most frequently used "stopwords", such as "are", "is", "the", "a", etc. The second scan removes any extra characters and turns every word into its morpheme (e.g. "took" becomes "take", "plates" becomes "plate"). The resulting sequence is then scanned so that every word is encoded as a node and their co-occurrence is encoded as the connection between them. The nodes are not only linked if they are next to each other in the text. The paragraph and sentence structures are taken into account, as well as theLandscape reading model, according to which our memory creates a landscape of concepts that are more likely to be activated during the reading process.
The resulting node-edge structure is encoded into a graph format and is processed byGephiserver-side Java toolkit, which calculates the basic metrics and applies community detection algorithms for the graph. The size of the nodes is then ranged according to their betweenness centrality (a measure of how often a node appears on the shortest path between any two randomly chosen nodes in the network), thus emphasizing the words, which often appear at the junctions of meaning. This is different from highlighting the most frequently used words as most programs do, because the words with the highest betweenness centrality are not necessarily often used within a text. They are rather the keywords that are often used to switch from one context to another within the text. Obviously, if a text has only a few contexts there will be a higher correlation between these two measures. The nodes are then colored according to the community they belong to. The nodes are considered to belong to the same community if they are more densely connected together than with the rest of the network. Modularity iterative algorithm embedded in Gephi does the job of finding these communities.
As the final step, the graph is visualized usingSigma.Jslibrary by Alexis Jacomy, which also applies Force Atlas layout, which pushes the nodes that belong to the same community together and pushes those communities apart in order to provide a more visually readable image of the graph.







The resulting graph image of the text can be used to get a quick visual representation of the main topics and influential keywords inside a text. According to studies in cognitive science, such visual maps can aid text comprehension tremendously. As the next step, the users can also click on the nodes within the graph. Then the parts of the text which have the term selected will be shown on the left-hand side. Like this the users can quickly skim through the text selecting the parts that are most relevant to their interests. The organization of network interface where the more prominent topics and keywords are visually emphasized allows this non-linear reading to occur in the way, which is still related to the original narrative. This is the difference from the tag clouds: the terms are organized according to how they are related to one another and the local structures within the text are made visible too. Tag clouds simply show the most frequently mentioned words ranging them in alphabetical order.



Our future plans are to make Textexture available through an API and to develop task-specific software based on it, such as the news reader, Facebook profile comparison tool, augmented search engine, etc. If you're interested in collaborating, please, do contact us!



Below is the above text represented through Textexture. By the way, as you can see, all the graphs can be made publicly available and embedded on any website with just a copy and paste code.

 

 

 
NOTICE: Please, use our new open source text network visualization and analysis tool InfraNodus. Textexture is no longer supported.
 
 
Nodes (Words): 100, Edges (Co-Occurrences): 440. Download processed GEXF file.
 

Most influential keywords in this text:
text    node    word    graph    filter: off

Most influential contexts in this text:
#0:   text    network    related    visually    filter: off
#1:   node    community    short    color    filter: off
#2:   word    show    scan    frequently    filter: off
#3:   graph    algorithm    textexture    embedded    filter: off

 

Link to this page:


Embed this graph:


Note: This embedded graph can be viewed by everyone when it's public: