<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
<channel>
<title>Neoformix</title>
<copyright>Copyright (c) 2006-2009 Jeff Clark. All rights reserved.</copyright>
<link>http://neoformix.com</link>
<description>Discovering and Illustrating Patterns in Data</description>
<language>en-us</language>
<lastBuildDate>Wed, 08 May 2013 13:50:02 GMT</lastBuildDate>

<item>
 <title>Visual Book Selector</title>
 <link>http://neoformix.com/2013/VisualBookSelector.html</link>
 <guid>http://neoformix.com/2013/VisualBookSelector.html</guid>
 <pubDate>Wed, 08 May 2013 12:00:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    One common pattern I see in many interactive applications is to support a person who is selecting a few items
    from some larger set. Often these items have various characteristics that the person wants to use in some way
    to guide their selection process. The characteristics can be numeric quantities, dates, categories, or names of things. Showing all the items in a list and allowing the person to sort by one of the attributes is often a decent default solution.
	</p>

  <p>
    In other cases it's more useful to consider multiple attributes at a time during the selection process. Maybe you want items that are high in one attribute, low in another, and are from a particular category. Ideally the selection
    process should be one of exploration and successive refinement where various filtering criteria are adjusted until some small subset of items is defined and can be investigated individually.
  </p>

  <p>
    I have built an example of this concept which I call the <a href="http://neoformix.com/Projects/BookSelector/">Visual Book Selector</a>. The books are directly represented with small circles and filters can be applied to progressively
    exclude books by various criteria. The filters are depicted visually as <i>gates</i> through which some of the items can pass and others cannot. The image below shows one possible configuration.
  </p>
 

  <center>
    <a href="http://neoformix.com/Projects/BookSelector/"><img  src="http://neoformix.com/2013/BookSelector1.png" width="712" height="537" class="shadowBox"></a>
  </center>

  <p>
    There are about 1000 books which start in the top segment of the display when no filters have been applied. In this
    example three of the category gates have been opened so books from those categories can pass through. The ones that don't pass this filter pile up near their closed gate, which helps give some understanding of their distribution. The books that pass the first criterion encounter a second filter on the average rating of the book from Google Book reviews. This filter gate is set to only allow books having an average rating of at least 4.0 to pass through. The final gate
    does a pattern match on Author name and lets through to the bottom the 4 books that have passed all of the criteria.
  </p>

  <p>
    The best way to get a feel for it is to <a href="http://neoformix.com/Projects/BookSelector/">try out the Visual Book Selector</a> yourself. You can use the dropdown selectors on the left of each segment barrier to choose different criteria on which to filter. Hover over a book to see details and click on its circle to visit the corresponding Google Books page.
  </p>

  <p>
    The list of books and their categories comes from the 2009 article in the Guardian <a href="http://www.guardian.co.uk/books/2009/jan/23/bestbooks-fiction">1000 novels everyone must read: the definitive list</a>. The other data was gathered from <a href="http://books.google.com/">Google Books</a>.
  </p>

  <p>
    I should also note that an excellent solution to the multi-attribute selection/exploration problem posed here is the <a href="http://moritz.stefaner.eu/projects/elastic-lists/">Elastic Lists</a> concept by <a href="http://moritz.stefaner.eu/">Moritz Stefaner</a>. It supports what's called Facet Browsing and enhances it with the visualization of proportions and distributions as well as animated transitions.
  </p>


 ]]></description>
</item>
<item>
 <title>Star Wars Movie Fingerprints</title>
 <link>http://neoformix.com/2013/StarWarsFingerprints.html</link>
 <guid>http://neoformix.com/2013/StarWarsFingerprints.html</guid>
 <pubDate>Wed, 27 Mar 2013 11:35:00 GMT</pubDate>
 <description><![CDATA[
 
	
	<p>
		Recently YouTube had a <a href="http://www.youtube.com/watch?feature=player_embedded&v=smdMh3Ew6IU">video that showed all six Star Wars movies at once</a>. They were placed in a 2 by 3 matrix and had an audio track of all the movies superimposed. It was an interesting experiment that has since been removed on copyright grounds. Before it was removed I was able to do some simple analysis on the video and extract some details of the individual episodes of the Star Wars series.
	</p>

	<p>
		Basically, I produced something very similar to a classic work called <a href="http://brendandawes.com/projects/cinemaredux">Cinema Redux&trade;</a> by <a href="http://brendandawes.com/">Brendan Dawes</a>, done in 2004. Each individual movie in the series was reduced to a collection of small snapshots taken at 1-second intervals. The snapshots are laid out 60 images per row so a row corresponds to a minute in the film. These 'fingerprint' images reveal some aspects of the film structure.
	</p>
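
	<p>
		As a rough illustration of the layout described above (not the original Processing code), the position of each snapshot in the grid can be computed from the second at which it was taken. The thumbnail width and height used here are hypothetical values chosen only for the example.
	</p>

```python
def snapshot_position(second, per_row=60, thumb_w=12, thumb_h=9):
    """Return the (x, y) pixel offset of the snapshot taken at `second`.

    per_row=60 follows the text: one row per minute of film.
    thumb_w and thumb_h are assumed thumbnail dimensions in pixels.
    """
    # The row is the minute of the film, the column is the second within it.
    row, col = divmod(second, per_row)
    return col * thumb_w, row * thumb_h
```

	<p>
		With these assumed sizes, the snapshot taken 125 seconds into the film lands in row 2, column 5 of the grid.
	</p>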

	<p>
		Click on any of these images to see higher resolution versions.
	</p>

  <center>
  	<i>Episode I: The Phantom Menace</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8595283338/sizes/k/in/photostream/" title="StarWars1 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8519/8595283338_2b0fdd676b_b.jpg" width="728" height="1024" alt="StarWars1" class="shadowBox"></a>	
	</center>
	<br><br>
  <center>
  	<i>Episode II: Attack of the Clones</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8594203647/sizes/k/in/photostream/" title="StarWars2 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8085/8594203647_6e2383f4dc_b.jpg" width="728" height="1024" alt="StarWars2"></a>
	</center>
	<br><br>
	
  <center>
  	<i>Episode III: Revenge of the Sith</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8595298072/sizes/k/in/photostream/" title="StarWars3 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8391/8595298072_ce48933da1_b.jpg" width="728" height="1024" alt="StarWars3"></a>
	</center>
	<br><br>

  <center>
  	<i>Episode IV: A New Hope</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8594208887/sizes/k/in/photostream/" title="StarWars4 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8520/8594208887_9ed07819fd_b.jpg" width="728" height="1024" alt="StarWars4"></a>
	</center>
	<br><br>

  <center>
  	<i>Episode V: The Empire Strikes Back</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8595292710/sizes/k/in/photostream/" title="StarWars5 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8105/8595292710_af9b835468_b.jpg" width="728" height="1024" alt="StarWars5"></a>
	</center>
	<br><br>

  <center>
  	<i>Episode VI: Return of the Jedi</i>
  	<a href="http://www.flickr.com/photos/25045595@N03/8595288058/sizes/k/in/photostream/" title="StarWars6 by JeffNormanClark, on Flickr"><img  src="http://farm9.staticflickr.com/8233/8595288058_a107d2b848_b.jpg" width="728" height="1024" alt="StarWars6"></a> 
	</center>

	<p>
		I used some fairly simple code in <a href="http://processing.org">Processing</a> to analyze the video and create the output images.
	</p>


 ]]></description>
</item>
<item>
 <title>Obesity Slopegraph</title>
 <link>http://neoformix.com/2013/ObesitySlopegraph.html</link>
 <guid>http://neoformix.com/2013/ObesitySlopegraph.html</guid>
 <pubDate>Tue, 26 Feb 2013 19:40:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		Last week the wonderful <a href="http://www.guardian.co.uk/news/datablog">Guardian Datablog</a> published an interesting post called <a href="http://www.guardian.co.uk/news/datablog/interactive/2013/feb/19/obesity-map-of-world-weight">Obesity worldwide: the map of the world's weight</a>. It contains a map that shows with
		color the rates of obesity around the world. The accompanying chart gives data for
		different time frames and for both male and female which you can select and view
		on the map. When I saw the chart I immediately thought of a number of interesting questions that could not be easily answered with the map or chart.
		<ol>
			<li>What is the trend over time?</li>
			<li>Do these trends exist worldwide?</li>
			<li>Which countries are exceptions to the trend?</li>
			<li>Which countries have the highest or lowest rates of obesity?</li>
			<li>Are there large gender-based differences in obesity rates in various countries?</li>
		</ol>
	</p>

	<p>
		Much of my past work has been driven by personal curiosity. That, together with my own background in science, has shaped my work such that most of it has been exploratory in nature. Recently I have been thinking more about the storytelling or communicative aspect of data visualization. This has been triggered by my admiration for the amazing work of the <a href="https://twitter.com/nytgraphics">New York Times Graphics Department</a>, and the writings of <a href="http://www.thefunctionalart.com/">Alberto Cairo</a>, <a href="http://eagereyes.org/">Robert Kosara</a>, <a href="http://www.visualisingdata.com/">Andy Kirk</a>, and <a href="http://jonathanstray.com/">Jonathan Stray</a>.
	</p>

	<p>
		I decided to try to build an <a href="http://neoformix.com/Projects/ObesitySlope/">interactive visualization</a> that helps answer the questions above. I also tried to build something that explicitly highlights some of the more
		interesting aspects of the data without sacrificing freeform exploration. I settled on
		using a <a href="http://charliepark.org/slopegraphs/">Slopegraph</a>, which was first described by <a href="http://www.edwardtufte.com/tufte/index">Edward Tufte</a> and is featured on the cover of Cairo's excellent book <a href="http://www.peachpit.com/store/functional-art-an-introduction-to-information-graphics-9780321834737">The Functional Art</a>.
	</p>

	<p>
		This first image shows the trend for male obesity organized by continent. It's a difficult problem to show labels for so many countries along one axis so I tried to alleviate it by letting the user expand or hide countries by continent group. In this case 'North America' is expanded to show its individual countries. Labels are only shown if they don't overlap with others. The largest countries by population are placed first.
	</p>

  <center>
		<img  src="http://neoformix.com/2013/Obesity1.jpg" class="shadowBox">
	</center>

	<p>
		Individual country lines can be clicked on to emphasize them with colour.
	</p>

  <center>
		<img  src="http://neoformix.com/2013/Obesity2.jpg" class="shadowBox">
	</center>

	<p>
		The third example shown below charts female values on the left against male values on the right in order to emphasize gender differences.
	</p>

  <center>
		<img  src="http://neoformix.com/2013/Obesity3.jpg" class="shadowBox">
	</center>

	<p>
		 The <a href="http://neoformix.com/Projects/ObesitySlope/">interactive visualization</a> includes a 'stepper' that takes the user through four different views. This helps introduce functionality gradually as well as serving to emphasize important patterns in the data.
	</p>

	<p>
		In addition to the people and organizations mentioned above I would like to acknowledge the people behind <a href="http://processing.org">Processing</a> and <a href="http://processingjs.org">Processing JS</a> which was used to build the application. The code for the dashed lines comes from <a href="http://processing.org/discourse/beta/num_1202486379.html">J David Eisenberg</a>. Thanks!
	</p>


 ]]></description>
</item>
<item>
 <title>Neoformix Site Redesign</title>
 <link>http://neoformix.com/2013/SiteRedesign.html</link>
 <guid>http://neoformix.com/2013/SiteRedesign.html</guid>
 <pubDate>Tue, 19 Feb 2013 11:10:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		In 2006, I started this blog as an outlet for my creative personal work as well as to gather in one place references to interesting work by other people. Since then, Neoformix has grown into a full-time business for me specializing in the development of custom data visualizations. I have just spent some time giving the website its first facelift in 7 years. I hope you like it!
	</p>

<!-- 	<p>
		I don't often need to highlight the work of other people any more because of the rise of many excellent sites like <a href="http://flowingdata.com">Flowingdata</a>, <a href="http://www.visualisingdata.com/">Visualising Data</a>, and <a href="http://visualizing.org">visualizing.org</a> that do an extraordinary job of curating the best content available.
	</p>
 -->
  <center>
		<img  src="http://neoformix.com/2013/SiteRedesign.png" class="shadowBox" width="500" height="430">
	</center>

	<p>
		I've tried to simplify the design and emphasize that Neoformix is a business
		by designing a main page that highlights some projects and moving the blog to a secondary page. Thanks to <a href="http://twitter.github.com/bootstrap/">Twitter Bootstrap</a> for a powerful front-end framework which I made use of in the redesign.
	</p>


 ]]></description>
</item>
<item>
 <title>Word Hearts Updated</title>
 <link>http://neoformix.com/2013/WordHeartsUpdated.html</link>
 <guid>http://neoformix.com/2013/WordHeartsUpdated.html</guid>
 <pubDate>Tue, 05 Feb 2013 13:05:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		About five years ago I posted a simple little application called <a href="http://neoformix.com/2008/WordHearts.html">Word Hearts</a>
		which lets you fill a heart shape with words. Last year it was the most visited page on my site despite the fact
		that it was still a Java applet-based application, which many modern browsers won't render. I have updated
		this tool to use ProcessingJS so it runs well in modern browsers. There is also enhanced functionality like:
		<ul>
			<li>You can fill circles, diamonds, stars, and squares as well as the original heart shape</li>
			<li>There are more fonts to choose from</li>
			<li>You can easily use small symbols like hearts, happy faces etc., in your list of words</li>
			<li>A nice color picker</li>
			<li>Word orientation options</li>
			<li>Vary the word colors so it looks more interesting</li>
			<li>Save your image</li>
		</ul>
	</p>

	<p>
		Here are a couple of examples of what you can do:
	</p>


  <center>
		<img  src="http://neoformix.com/2013/wordHearts1.png" border="1" width="468" height="519"><br>
		<img  src="http://neoformix.com/2013/wordHearts2.png" border="1" width="468" height="519"><br>
	</center>

	<p>
		Launch the interactive version of <a href="http://www.neoformix.com/Projects/WordHearts/index.html">Word Hearts</a> to try it out.
	</p>

	<p>
		This was created with <a href="http://processingjs.org">Processing JS</a> and also uses the
		 <a href="http://jscolor.com/">JSColor color picker</a> and the <a href="https://github.com/CD1212/jQuery-Font-Chooser">JQuery Font Chooser</a>.
 		Thank you!
	</p>


 ]]></description>
</item>
<item>
 <title>Grimm's Fairy Tale Metrics</title>
 <link>http://neoformix.com/2013/GrimmStoryMetrics.html</link>
 <guid>http://neoformix.com/2013/GrimmStoryMetrics.html</guid>
 <pubDate>Thu, 31 Jan 2013 16:05:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		I have built another little digital humanities project based on
		the text of the 62 stories in <a href="http://en.wikipedia.org/wiki/Grimms'_Fairy_Tales">Grimm's Fairy Tales</a>.
		This one is called <a href="http://neoformix.com/Projects/GrimmsMetrics/">Grimm's Story Metrics</a> and presents an interactive matrix of stories together with various metrics calculated from their text.
		You can click on a column to sort by that data, click again to reverse the direction, and click on a story name
		to open it in another window.
		The image below shows the stories sorted by the 'Royalty' metric which indicates, as you would expect, how many
		references there are to words related to the topic of royalty. Click on the image to go to the interactive tool.
	</p>

  <center>
		<a href="http://neoformix.com/Projects/GrimmsMetrics/"><img  src="http://neoformix.com/2013/GrimmsMetrics1_720.png" border="1" width="720" height="423"></a><br>
	</center>

	<p>
		Hovering over any of the bars shows details
		about that particular measurement. Most of the metrics, like 'Royalty', are based on topics and the details
		shown are the words characteristic of that topic used in the story. So, for example, the details for 'Royalty'
		in the 'Frog-Prince' are <i>princess, prince, king, kingdom</i> which are listed in frequency order. These topical
		metrics are normalized based on total words in the story so longer stories have no scoring advantage.
	</p>
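
	<p>
		A minimal sketch of how such a topical metric might be computed is shown here. It is an illustration rather than the code behind the tool: the function name, the word list for the topic, and the exact normalization (topic hits divided by total words) are all assumptions.
	</p>

```python
from collections import Counter

def topic_score(words, topic_words):
    """Score a story for one topic and list the matching words.

    words: the story as a list of lowercase words.
    topic_words: a set of words assumed to characterize the topic.
    Returns (score, details), where score is the number of topic-word hits
    divided by the story length and details lists the matched words in
    descending frequency order.
    """
    hits = Counter(w for w in words if w in topic_words)
    score = sum(hits.values()) / len(words) if words else 0.0
    details = [w for w, _ in hits.most_common()]
    return score, details
```

	<p>
		Dividing by the story length is what keeps a long story from scoring higher simply because it has more words.
	</p>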

	<p>
		The 'Lexical Diversity' is the ratio of the number of unique words in the story to the total number of words. These stories
		are fairly short and you can observe a rough inverse relationship between 'Story Length' and 'Lexical Diversity'.
		'Clever Hans' is an outlier in this relationship. If you examine the <a href="http://neoformix.com/Projects/GrimmsExplorer/stories.html#link2H_4_0044">text</a> for this story you'll see that there
		is a great deal of repetition.
	</p>
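
	<p>
		The ratio itself is simple to compute. The sketch below assumes a basic tokenization (lowercased runs of letters and apostrophes), which may differ from what the tool actually uses.
	</p>

```python
import re

def lexical_diversity(text):
    """Return unique words divided by total words for a piece of text."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

	<p>
		A story with a lot of repeated phrasing reuses the same words many times, so its ratio drops compared with other stories of similar length.
	</p>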

	<p>
		This was created with <a href="http://processingjs.org">Processing JS</a>.
		The text analyzed is the English translation by Edgar Taylor and Marian Edwardes available at <a href="http://www.gutenberg.org/cache/epub/2591/pg2591.txt">Project Gutenberg</a>.
 		Thank you!
	</p>


 ]]></description>
</item>
<item>
 <title>Les Miserables Word Graph</title>
 <link>http://neoformix.com/2013/LesMisWordGraph.html</link>
 <guid>http://neoformix.com/2013/LesMisWordGraph.html</guid>
 <pubDate>Fri, 25 Jan 2013 10:25:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		Here is a word graph for the <a href="http://www.gutenberg.org/cache/epub/135/pg135.txt">text</a> of 
		the novel <a href="http://en.wikipedia.org/wiki/Les_Mis%C3%A9rables">Les Miserables</a> by Victor Hugo.
  	Click on the image to see a large (4 MB) version which makes all the words legible.
  </p>

  
  <center>
		<a href="http://www.neoformix.com/2013/LesMisWordGraph1_4000.png"><img  src="http://neoformix.com/2013/LesMisWordGraph1_700.png" border="1" width="700" height="672"></a><br>
	</center>

	<p>
		The area of each word reflects its frequency in the text. For each word, the three most similar words are considered for connections,
		with word similarity defined by collocation within the text. Words in the outer ring have only one
		weak connection to another word in the graph.
	</p>


 ]]></description>
</item>
<item>
 <title>Grimm Fairy Tale Browser</title>
 <link>http://neoformix.com/2013/GrimmBrowser.html</link>
 <guid>http://neoformix.com/2013/GrimmBrowser.html</guid>
 <pubDate>Tue, 22 Jan 2013 11:15:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		My previous post on the <a href="http://neoformix.com/2013/GrimmNetwork.html">Grimm's Fairy Tale Network</a>
		showed a graph illustrating the strongest connections between the various stories. I used a few techniques
		to try to prevent the usual mess of connections that often obscures the relationships of interest.
	</p>

	<p>
		Another way of tackling graphs with lots of connections is to only show a small portion of the graph at a time
		and use interaction to provide navigation. This lets you browse around a complex network of nodes and relations
		and repeatedly get views centered on a node of interest. I've created an example of this for the Grimm's fairy tale data
		which I call the <a href="http://neoformix.com/Projects/GrimmsExplorer/">Grimm Fairy Tale Connection Browser</a>.
	</p>

	<p>
		The image below shows the connections to the story 'Little Red Riding Hood'. The larger circles are stories
		and the smaller ones represent key words in the collection. The inner ring shows the words and stories
		closely connected to the story of interest. The outer ring gives the stories and words that are
		related but with less strength. You can click on any story or word to make it the new focus node.
		Click on the image below to launch the interactive version.
	</p>

  <center>
		<a href="http://neoformix.com/Projects/GrimmsExplorer/#/Story_21"><img  src="http://neoformix.com/2013/GrimmsBrowser1.jpg" border="1"></a><br>
	</center>

	<p>
		This second example shows the stories and other words highly related to the word 'wolf'. The interactive tool
		shows the <a href="http://www.gutenberg.org/files/2591/2591-h/2591-h.htm">Gutenberg</a> version of the stories in a panel on the right.
		When a new story is made the central focus of the visualization the right panel shows the story text.
	</p>
 
  <center>
		<a href="http://neoformix.com/Projects/GrimmsExplorer/#/wolf"><img  src="http://neoformix.com/2013/GrimmsBrowser2.jpg" border="1"></a><br>
	</center>

	<p>
		This was created with <a href="http://processingjs.org">Processing JS</a>.
	</p>


 ]]></description>
</item>
<item>
 <title>Grimm's Fairy Tale Network</title>
 <link>http://neoformix.com/2013/GrimmNetwork.html</link>
 <guid>http://neoformix.com/2013/GrimmNetwork.html</guid>
 <pubDate>Tue, 15 Jan 2013 21:55:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		I have had some fun playing around analyzing the text of the stories in <a href="http://en.wikipedia.org/wiki/Grimms'_Fairy_Tales">Grimm's Fairy Tales</a>.
    There are 62 stories in this set and they contain many popular tales such as Little Red Riding Hood, Snow White, and Rapunzel.
		The text analyzed is the English translation by Edgar Taylor and Marian Edwardes available at <a href="http://www.gutenberg.org/cache/epub/2591/pg2591.txt">Project Gutenberg</a>.
	</p>

	<h3>Story Connections</h3>
  <p>
  	The graphic below is a simple network showing which stories are connected through the use of a common vocabulary.
 		There are three different strengths of connection shown and I've tried to minimize the usual 'hairball' nature
 		of these types of diagrams by only showing the top three connections for a story. Some stories will have more
 		than three links because the link meets the top-three threshold for the story on the other end of the link.
 		The shade of blue simply indicates the number of connections for that story - the darker the shade the more connections.
  	Click on the image to see a larger version.
	</p>
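
	<p>
		The link-selection rule described above can be sketched as follows. This is an illustration under assumptions, not the original code: the similarity scores are taken as given, and the function and variable names are hypothetical.
	</p>

```python
from collections import defaultdict

def top_k_links(similarity, k=3):
    """Keep a link if it is among the k strongest links of either endpoint.

    similarity maps an unordered pair of stories, stored as a
    (story_a, story_b) tuple, to a similarity score.
    Returns a set of frozensets, one per link that is kept.
    """
    neighbours = defaultdict(list)
    for (a, b), score in similarity.items():
        neighbours[a].append((score, b))
        neighbours[b].append((score, a))

    kept = set()
    for story, scored in neighbours.items():
        # Take this story's k strongest links.
        for score, other in sorted(scored, reverse=True)[:k]:
            kept.add(frozenset((story, other)))
    return kept
```

	<p>
		Because a link is kept when either endpoint ranks it in its own top k, a story can end up with more than k links, which matches the behaviour noted above.
	</p>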
  
  <center>
		<a href="http://www.neoformix.com/2013/Grimm4_e3.jpg"><img  src="http://neoformix.com/2013/Grimm4_e3_s700.jpg" border="1" width="700" height="553"></a><br>
	</center>

	<p>
		The diagram shows in the upper-right corner for example that 'Little Red Riding Hood' is strongly linked to 'The Wolf and the Seven Little Kids'.
		My analysis shows that the strength of this connection is due to them both using words like wolf, stones, door, belly, scissors, drowned, and devour.
	</p>


 ]]></description>
</item>
<item>
 <title>Novel Views: Les Miserables</title>
 <link>http://neoformix.com/2013/NovelViews.html</link>
 <guid>http://neoformix.com/2013/NovelViews.html</guid>
 <pubDate>Tue, 08 Jan 2013 13:50:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
    The project 'Novel Views' consists of a series of visualizations
    of the novel <a href="http://en.wikipedia.org/wiki/Les_Mis%C3%A9rables">Les Miserables</a> by Victor Hugo. The
    text analyzed is the English translation by Isabel F. Hapgood available at <a href="http://www.gutenberg.org/cache/epub/135/pg135.txt">Project Gutenberg</a>.
	</p>

	<h3>Character Mentions</h3>
  <p>
	  This graphic shows where the names of the primary characters
	  are mentioned within the text. Click on any of these images to see larger versions.
	</p>
  
  <center>
		<a href="http://www.neoformix.com/2013/NovelViews3_l.png"><img  src="http://neoformix.com/2013/NovelViews3_m.png" border="1" width="700" height="400"></a><br>
	</center>

	<p>
		Characters are listed from top to bottom in their order of appearance.
	  The horizontal space is segmented into the 5 volumes of the novel.
	  Each volume is subdivided further with a faint line indicating the various books and, finally,
	  small rectangles indicate the chapters within the books.
	  In the 5 volumes there are a total of 48 books and 365 chapters.
	  The height of each small rectangle indicates how frequently that character is mentioned in that particular chapter.
	</p>

	<h3>Radial Word Connections</h3>
  <p>
		A word used in multiple places in a text can be interpreted as a connection between those locations.
		Depending on the word itself the connection could be in terms of character, setting, activity, mood, or other aspects of the text.
		This graphic shows a number of these word connections.
	</p>

  <center>
		<a href="http://www.neoformix.com/2013/NovelViews14_l.png"><img  src="http://neoformix.com/2013/NovelViews14_m.png" border="1" width="700" height="400"></a><br>
	</center>

	<p>
		The 365 chapters of the text are shown with small segments on the inner ring of the circle with the first chapter appearing at the top and proceeding clockwise from there.
		The outer ring shows how the chapters are grouped into books of the novel and the book titles are shown as well.
		The words in the middle are connected using lines of the same color to the chapters where they are used.
		The edge bundling technique together with the Volume - Book - Chapter hierarchy of the text are
		used so the patterns of connections are more easily revealed. 
  </p>
  
<br><a href="http://neoformix.com/2013/NovelViews.html#SummaryEnd">(More...)</a>

 ]]></description>
</item>
<item>
 <title>Delaunay Images II</title>
 <link>http://neoformix.com/2012/DelaunayImages2.html</link>
 <guid>http://neoformix.com/2012/DelaunayImages2.html</guid>
 <pubDate>Tue, 02 Oct 2012 10:50:00 GMT</pubDate>
 <description><![CDATA[
 
	<p>
		A few years back I played around with creating Delaunay Images as described <a href="http://neoformix.com/2009/VoronoiAndDelaunayImages.html">here</a>
		and <a href="http://neoformix.com/2009/MoreAbstractImages.html">here</a>. That work was inspired by
		these <a href="http://www.jonathanpuckey.com/projects/delaunay-raster/">Delaunay Images</a> created by <a href="http://www.jonathanpuckey.com">Jonathan Puckey</a>.
	</p>

  <p>
  	The Delaunay process involves creating a triangular mesh in order to construct a more abstract
  	version of a starting image based on some control points. In the past I either manually selected
  	the control points or chose them randomly. I just recently came across some
  	<a href="http://jsdo.it/akm2/xoYx">JavaScript code by 'akm2'</a> for creating
  	these types of images and discovered that it uses a more clever approach. Basically, edge detection
  	is done on the base image and the Delaunay control points are chosen from points on the edges.
  	Using this idea as a starting point I modified the code a bit to make the triangles more transparent
  	as they decrease in size. This lets us create a triangulated abstract version of an image
  	while letting the details of the original show through in key areas. An example is below:
  </p>
  
  <center><img  src="http://neoformix.com/2012/delaunay_sharbat.png"></center>

  <p>
  	I really like the effect and it's completely automatic which opens up some interesting possibilities.
  	The original base image is by Steve McCurry and is of
		Sharbat Gula. A retrospective on her life done by National Geographic
		<a href="http://ngm.nationalgeographic.com/2002/04/afghan-girl/index-text">can be found here</a>.
  </p>
  

 ]]></description>
</item>
<item>
 <title>Ablaze</title>
 <link>http://neoformix.com/2012/Ablaze.html</link>
 <guid>http://neoformix.com/2012/Ablaze.html</guid>
 <pubDate>Thu, 20 Sep 2012 13:15:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    I recently came across an interesting JavaScript tool that generates images
    by drawing lines between pairs of moving invisible points whenever they come within a threshold distance of each other.
    It's called <a href="http://theorigin.net/ablazejs/">Ablaze</a> and it was
    created by <a href="http://pat.theorigin.net/">Patrick Gunderson</a>. It's got a bunch
    of options to give you some creative control over what gets produced.
  </p>
  
  <center><img  src="http://neoformix.com/2012/ablazejs1.png"></center>
  

 ]]></description>
</item>
<item>
 <title>Movement in Manhattan Video</title>
 <link>http://neoformix.com/2012/MovementInManhattanVideo.html</link>
 <guid>http://neoformix.com/2012/MovementInManhattanVideo.html</guid>
 <pubDate>Tue, 08 May 2012 07:20:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    In my last post about visualizing <a href="http://neoformix.com/2012/MovementInManhattan.html">Movement in Manhattan</a> I mentioned that
    it would be interesting to explore a more direct view of the data by using an animation. I have created such a video based on
    a fresh collection of tweets from Monday, April 30th. I gathered new data because I realized that my previous data set
    was collected over the weekend and I suspected that a weekday might provide more obvious patterns.
    It compresses 24 hours of data into 1 minute of video. Here it is:
  </p>
  
  <iframe src="http://player.vimeo.com/video/41703644" width="640" height="480" frameborder="0" webkitAllowFullScreen mozallowfullscreen allowFullScreen></iframe>  
  
  <p>
    I was influenced by the <a href="http://crowdflow.net/2011/07/12/fireflies-hd/">'Fireflies'</a>
    video showing iPhone traces done by <a href="https://twitter.com/#!/michaelkreil">Michael Kreil</a>. In particular, I
    like the idea of using larger but more transparent graphics to represent the increased uncertainty when drawing interpolated
    locations. Basically, if a person tweets at location A and then again at location B ten minutes later the model I used
    assumes they moved at a constant speed in a straight line between those two events. This is an obviously crude approximation
    and leads to unrealistic paths in many cases. By increasing the transparency in between the two measured events it shows this
    uncertainty in a visual manner.
  </p>
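
  <p>
    The straight-line model described above can be sketched roughly as follows. This is not the code used for the video: the shape of the uncertainty curve (zero at each measured tweet, largest midway between them) and the radius and transparency values are assumptions made for illustration.
  </p>

```python
def interpolate(a, b, t):
    """Estimate a position at time t between two tweets by the same person.

    a and b are (time, x, y) tuples with different times, and t lies
    between them. Returns (x, y, uncertainty) with uncertainty in [0, 1].
    """
    ta, xa, ya = a
    tb, xb, yb = b
    f = (t - ta) / (tb - ta)  # fraction of the way from a to b
    x = xa + f * (xb - xa)    # constant speed along a straight line
    y = ya + f * (yb - ya)
    # Assumed profile: 0 at either measured tweet, 1 at the midpoint.
    uncertainty = 1.0 - abs(2.0 * f - 1.0)
    return x, y, uncertainty

def marker_style(uncertainty, base_radius=2.0, extra_radius=6.0, min_alpha=0.15):
    """Map uncertainty to a larger, more transparent marker (assumed values)."""
    radius = base_radius + extra_radius * uncertainty
    alpha = 1.0 - (1.0 - min_alpha) * uncertainty
    return radius, alpha
```

  <p>
    A marker drawn exactly at a tweet is small and opaque, while one drawn halfway between two tweets is at its largest and faintest.
  </p>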
  
  <p>
    Again, as I saw in the original version, the patterns of tweets, both moving and static, are quite chaotic. You can
    easily see the rise and fall of tweets over the changing time of day and some local patterns that look interesting,
    but overall the picture is still a bit of a jumble.
  </p>
  
  <p>
    The geolocated tweets were collected with the library <a href="http://twitter4j.org/en/index.html">Twitter4J</a> which was used
    from code written in <a href="http://processing.org">Processing</a>. I used <a href="http://blog.blprnt.com/blog/blprnt/updated-quick-tutorial-processing-twitter">this tutorial</a> 
    created by <a href="https://twitter.com/#!/blprnt">Jer Thorp</a> to get started with the library. Code from this
    <a href="http://www.shiffman.net/itp/classes/nature/week06_s09/flowfield/">flow field sample</a> by <a href="https://twitter.com/#!/shiffman">Daniel Shiffman</a>
    was used as a starting point to create my flow maps. The background map is from <a href="http://www.openstreetmap.org">OpenStreetMap</a>.
    Thanks everyone!
  </p>
  

 ]]></description>
</item>
<item>
 <title>Movement in Manhattan</title>
 <link>http://neoformix.com/2012/MovementInManhattan.html</link>
 <guid>http://neoformix.com/2012/MovementInManhattan.html</guid>
 <pubDate>Wed, 18 Apr 2012 11:35:00 GMT</pubDate>
 <description><![CDATA[
   
  <p>
    Inspired by the beautiful and elegant <a href="http://hint.fm/wind/">Interactive Wind Map</a>
    created by <a href="https://twitter.com/#!/viegasf">Fernanda Viegas</a> and <a href="https://twitter.com/#!/wattenberg">Martin Wattenberg</a>
    I have begun to explore the flow of people within a city.
    An ideal dataset to do this would include
    the GPS traces from thousands of people wearing trackers for weeks as they go about their daily lives.
    Organizations such as <a href="http://crowdflow.net">crowdflow.net</a> and <a href="https://openpaths.cc/">OpenPaths</a>
    collect voluntarily donated data of this type and might be fruitful to explore. I decided, instead, to
    use geolocated tweets to try to see how the movement of people is affected by the urban landscape.
  </p>
 
  <p>
    The image below shows an area of Manhattan roughly from Houston Street north to 72nd Street which corresponded to the
    region with the most geolocated tweets that I collected. It includes Times Square, Grand Central Station, the Empire
    State Building, Rockefeller Center, the southern portion of Central Park, and many other well-known landmarks. The
    blue and red markings are an attempt to show the flow of people based on the data.
  </p>
 
	<center>
		<img  src="http://neoformix.com/2012/nyc8All_flow_oL.png" border="0" width="719" height="971">
  </center>

  <p>
    Basically, tweets sent by the same
    person within a 4-hour time window were used as samples of speed and direction. These samples were used to construct
    a vector field representing the average flow of people within the area. The vector field and total tweet density over
    the space were then used to simulate the movement of people. Particles, representing people, were released at locations
    where actual tweets were recorded and their subsequent movement was determined by the flow field. The particles start out
    blue and gradually change through purple to red over time so each trace shows the direction of movement. Locations where
    there is little movement will have blue dots or very short blue traces. Longer traces with more red show a greater
    speed at that point.
  </p>
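
  <p>
    The steps above can be sketched in Python as follows. This is a simplified illustration, not the original Processing code.
    The grid cell size, the choice to credit each velocity sample to the cell where the segment starts, and all function
    names are assumptions. Tweet density, which the actual simulation also used, is left out.
  </p>

```python
from collections import defaultdict

MAX_GAP_S = 4 * 3600  # the 4-hour window mentioned in the text


def path_segments(tweets):
    """tweets: iterable of (user, t_seconds, x, y).
    Yield (x, y, vx, vy) for consecutive tweets by the same user within MAX_GAP_S."""
    by_user = defaultdict(list)
    for user, t, x, y in tweets:
        by_user[user].append((t, x, y))
    for events in by_user.values():
        events.sort()
        for (t0, x0, y0), (t1, x1, y1) in zip(events, events[1:]):
            dt = t1 - t0
            if 0 < dt <= MAX_GAP_S:
                yield x0, y0, (x1 - x0) / dt, (y1 - y0) / dt


def build_flow_field(segments, cell):
    """Average the velocity samples whose start point falls in each grid cell."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for x, y, vx, vy in segments:
        s = sums[(int(x // cell), int(y // cell))]
        s[0] += vx
        s[1] += vy
        s[2] += 1
    return {key: (sx / n, sy / n) for key, (sx, sy, n) in sums.items()}


def step(pos, field, cell, dt):
    """Move a particle by the average velocity of its cell (no movement in empty cells)."""
    x, y = pos
    vx, vy = field.get((int(x // cell), int(y // cell)), (0.0, 0.0))
    return x + vx * dt, y + vy * dt


def age_color(age, max_age):
    """RGB colour that goes from blue through purple to red as the particle ages."""
    f = min(max(age / max_age, 0.0), 1.0)
    return (int(round(255 * f)), 0, int(round(255 * (1 - f))))
```

  <p>
    A particle released at a tweet location is advanced by calling <code>step</code> repeatedly, and each new trace segment
    is drawn with <code>age_color</code>, so a fast-moving particle leaves a long trace that reaches red.
  </p>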
  
  <p>
    The density and direction of the flow patterns seem reasonable but they do appear fairly chaotic - much more so
    than the patterns seen in wind flow for example. This makes sense for many reasons. First, people are much less
    deterministic than the molecules that make up the air. Second, the environment that they exist in is extremely complex.
    Third, statistically we are dealing with a much smaller sample size: in this case, roughly 34,000 geolocated tweets
    with only 9,600 path segments. If we had a million times more data then the average patterns would be clearer.
    Another important factor is that this data was collected over a few days and so there may be clear patterns for specific
    times of day that are mixed together visually.
  </p>
  
  <p>
    I have produced three more images that separate out the data by time of day. This first one only uses data from 6-11 am.
    It does appear to be a bit simpler and shows a few interesting patterns but it is still fairly chaotic. There is a strong flow east out from Central Park
    near 65th Street. There is also a more scattered flow from the east into New York University near the bottom left.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/nyc8m_flow_oL.png" border="0" width="719" height="971">
	</center>

  <p>
    The afternoon flow map shows a greater overall density indicating a greater number of locations from which people
    are tweeting. There also appears to be a strong convergence on the area of 14th Street - 4th Avenue.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/nyc8a_flow_oL.png" border="0" width="719" height="971">
	</center>

  <p>
    The evening map is also quite busy with lots of small local patterns. There is heavy action between 50th and
    57th Streets. Comparing these three versions is easier with this <a href="http://www.flickr.com/photos/25045595@N03/6931225412/in/set-72157629815896423/lightbox/">Flickr lightbox version of the images.</a>
  </p>
  <p>
    Overall, there are lots of flows and some of them likely reflect real movement of people within Manhattan.
    Many others probably just reflect noisy data because the sample size is so small. It's difficult to distinguish
    between the two cases here. The technique itself might warrant further study with more data. Another interesting
    avenue to explore would be to more directly visualize the data with an animation like this <a href="http://crowdflow.net/2011/07/12/fireflies-hd/">'Fireflies'</a>
    video showing iPhone traces done by <a href="https://twitter.com/#!/michaelkreil">Michael Kreil</a>.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/nyc8e_flow_oL.png" border="0" width="719" height="971">
	</center>
  
  <p>
    The geolocated tweets were collected with the library <a href="http://twitter4j.org/en/index.html">Twitter4J</a> which was used
    from code written in <a href="http://processing.org">Processing</a>. I used <a href="http://blog.blprnt.com/blog/blprnt/updated-quick-tutorial-processing-twitter">this tutorial</a> 
    created by <a href="https://twitter.com/#!/blprnt">Jer Thorp</a> to get started with the library. Code from this
    <a href="http://www.shiffman.net/itp/classes/nature/week06_s09/flowfield/">flow field sample</a> by <a href="https://twitter.com/#!/shiffman">Daniel Shiffman</a>
    was used as a starting point to create my flow maps. The background map is from <a href="http://www.openstreetmap.org">OpenStreetMap</a>.
    Thanks everyone!
  </p>
  

 ]]></description>
</item>
<item>
 <title>Datavis Subgroup Word Analysis</title>
 <link>http://neoformix.com/2012/DataVisFieldWords.html</link>
 <guid>http://neoformix.com/2012/DataVisFieldWords.html</guid>
 <pubDate>Mon, 05 Mar 2012 07:30:00 GMT</pubDate>
 <description><![CDATA[
   
  <p>
    This is Part 4 of a set of posts related to the analysis of the Data Visualization Field on Twitter. For context
    or more information you may want to read those other posts first. They are:
    <ol>
    <li><a href="http://neoformix.com/2012/DataVisField.html">The Data Visualization Field on Twitter</a></li>
    <li><a href="http://neoformix.com/2012/DataVisFieldSubGroups.html">Data Visualization Field Subgroups</a></li>
    <li><a href="http://neoformix.com/2012/DataVisFieldConnections.html">Datavis Blue-Red Connections</a></li>
    </ol>
  </p>

  <p>
    In the previous posts we have seen that there are two fairly cohesive subgroups of twitter accounts that emerged
    from our analysis of the original 1000 accounts. I've been calling them the 'blue' and the 'red'. They were
    determined by looking exclusively at the references to twitter IDs within the tweets that were sent.
  </p>

  <p>
    Presumably
    the fact that there are two fairly distinct groups would also be reflected in what they are discussing. I've done
    some analysis of the words used within the tweets for both groups. English stop words ('the', 'and', 'or', ...)
    and other words commonly found in tweets ('new', 'via', 'like', 'day', ...) were excluded. Word clouds definitely
    have their limitations but I believe they can be an effective way to get a qualitative feel for a body of text.
    I have used <a href="http://wordle.net">Wordle</a> to construct word clouds for the two groups.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/DataVisBlueCloud.png" border="0" width="700" height="422">
    <br><br><br>
		<img  src="http://neoformix.com/2012/DataVisRedCloud.png" border="0" width="700" height="410">
	</center>
  
  <p>
    It's clear that the blue group tweets a lot about 'art', 'code', 'design', 'processing', 'project', 'app'
    and 'workshop'. The red group tweets about 'data', 'visualization', 'design', 'infographic', and 'visual'.
    There is some overlap for sure but it's clear that they emphasize different things in what they are talking about.
  </p>
  
  <p>
    Right from the very start I was calling the whole set of accounts the 'Data Visualization Field'. Of course, a more
    accurate description was that I was looking at the 'Set of Accounts on Twitter Connected Through Tweet Mentions from
    @moritz_stefaner, @datavis, @infosthetics, @wiederkehr, @FILWD, @janwillemtulp,
    @visualisingdata, @jcukier, @mccandelish, @flowingdata, @mslima, @blprnt, @pitchinteractiv, @bestiario140, @eagereyes, @feltron, @stamen, and @thewhyaxis'.  
    It doesn't exactly roll off the tongue. From looking at these word clouds it appears that the red group could
    reasonably be named 'The Data Visualization Field' and the blue group something like 'Computational Artists and Designers'.
  </p>
  
  <p>
    If we want to contrast these two groups more directly we can look for words that are used much more frequently
    in tweets of one group than the other. I've done this for words that met both an overall frequency threshold and an author support
    threshold - they were used by at least 10% of the group members. The bar charts show the frequency proportion.
    So, for example, in the large sample of tweets I looked at from both of the two groups if you count
    the number of times the word 'makerbot' was used then 99% of those instances were in tweets from people in the
    blue group.
  </p>
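
  <p>
    As a rough illustration of the calculation described above, here is a small Python sketch. The thresholds, the function
    names, and the decision to measure author support within whichever group uses the word more are assumptions, not a
    record of the code actually used.
  </p>

```python
from collections import Counter, defaultdict


def group_stats(tweets):
    """tweets: list of (author, [words]). Return word counts, authors per word, author count."""
    counts = Counter()
    authors_using = defaultdict(set)
    authors = set()
    for author, words in tweets:
        authors.add(author)
        for w in words:
            counts[w] += 1
            authors_using[w].add(author)
    return counts, authors_using, len(authors)


def blue_share(blue_tweets, red_tweets, min_count=5, min_support=0.10):
    """Return {word: fraction of all uses of the word that came from the blue group}
    for words passing the overall frequency and author support thresholds."""
    b_counts, b_auth, b_n = group_stats(blue_tweets)
    r_counts, r_auth, r_n = group_stats(red_tweets)
    result = {}
    for w in set(b_counts) | set(r_counts):
        total = b_counts[w] + r_counts[w]
        if total < min_count:
            continue
        # Assumed rule: support is checked in the group that uses the word more.
        if b_counts[w] >= r_counts[w]:
            support = len(b_auth[w]) / b_n if b_n else 0.0
        else:
            support = len(r_auth[w]) / r_n if r_n else 0.0
        if support >= min_support:
            result[w] = b_counts[w] / total
    return result
```

  <p>
    A value of 0.99 for 'makerbot' would correspond to the 99% figure quoted above. Values near 0 mark words used almost exclusively by the red group.
  </p>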

	<center>
		<img  src="http://neoformix.com/2012/TopBlueWords.png" border="0" width="425" height="835">
    <br><br><br>
		<img  src="http://neoformix.com/2012/TopRedWords.png" border="0" width="428" height="827">
	</center>
  
  <p>
    This shows even more clearly the different things that these two groups emphasize. 
  </p>
  

 ]]></description>
</item>
<item>
 <title>Datavis Blue-Red Connections</title>
 <link>http://neoformix.com/2012/DataVisFieldConnections.html</link>
 <guid>http://neoformix.com/2012/DataVisFieldConnections.html</guid>
 <pubDate>Fri, 02 Mar 2012 15:30:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    The recent post on <a href="http://neoformix.com/2012/DataVisFieldSubGroups.html">Data Visualization Field Subgroups</a> had
    an interesting reaction on Twitter that I didn't expect. Many people that were placed in the 'red group' by the
    community detection algorithm in Gephi joked about being part of the 'team' and being happy to represent it and
    be grouped together with the others. <a href="http://www.datatelling.com/">Jen Lowe</a> lightheartedly
    suggested a <a href="https://twitter.com/#!/datatelling/status/174731939868188673">scrimmage at #eyeo</a> 
    between the red and blue. There was much less reaction from the 'blue group', likely because I'm embedded
    within the reds myself and so they likely paid more attention to my posts and the subsequent reaction on twitter.
  </p>
  
  <p>
    There do, indeed, seem to be two fairly cohesive groups of people here but I suspect there are
    many connections between the groups as well. We can use some simple network analysis to get a feel for this.
    Here are a few statistics calculated on the blue and red groups only:
  </p>
  
    <table border="1" cellspacing="0" >
      <col width="200" />
      <col width="40" />
      <col width="40" />
    <thead>
      <tr>
        <th class="Characteristic-cell">Characteristic</th>
        <th class="Blue-cell">Blue</th>
        <th class="Red-cell">Red</th>
      </tr>
    </thead>
    <tbody>
      <tr class="firstRow">
        <td class="Characteristic-cell">Number of Nodes</td>
        <td class="Blue-cell">216</td>
        <td class="Red-cell">244</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Total In-Links</td>
        <td class="Blue-cell">6734</td>
        <td class="Red-cell">5712</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Total Out-links</td>
        <td class="Blue-cell">6070</td>
        <td class="Red-cell">6376</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Avg In-Links</td>
        <td class="Blue-cell">31.18</td>
        <td class="Red-cell">23.41</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Avg Out-Links</td>
        <td class="Blue-cell">28.1</td>
        <td class="Red-cell">26.13</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Total Intergroup links</td>
        <td class="Blue-cell">665</td>
        <td class="Red-cell">1329</td>
      </tr>
      <tr>
        <td class="Characteristic-cell">Total Intragroup links</td>
        <td class="Blue-cell">5405</td>
        <td class="Red-cell">5047</td>
      </tr>
      <tr class="lastRow">
        <td class="Characteristic-cell">Percent Intergroup links</td>
        <td class="Blue-cell">10.96%</td>
        <td class="Red-cell">20.84%</td>
      </tr>
    </tbody>
  </table>
  
  <p>
    Both groups are pretty similar in most respects. The primary difference is that blue group members
    have on average more incoming links and that the percentage of intergroup links going from someone
    in one group to someone in the other is roughly double for reds. Remember that a link from A to B
    means that A referenced B in a tweet through a reply, a retweet, or just mentioning them in some context.
    When considering just the links between these two groups the people in red are referring to the people in blue
    at twice the rate of the reverse.
  </p>
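
  <p>
    The derived rows of the table follow from three raw numbers per group: node count, intragroup links, and outgoing
    intergroup links. Because only the two groups are considered, a group's in-links are its intragroup links plus the
    other group's intergroup links. The Python sketch below (function and key names are illustrative) reproduces the table
    values from those inputs.
  </p>

```python
def summarize(intra, inter_out, nodes):
    """Each argument is a dict keyed by group name; exactly two groups are expected.
    Returns the derived link statistics for each group."""
    names = list(nodes)
    if len(names) != 2:
        raise ValueError("this sketch only handles two groups")
    stats = {}
    for name in names:
        other = names[1] if name == names[0] else names[0]
        out_links = intra[name] + inter_out[name]
        in_links = intra[name] + inter_out[other]
        stats[name] = {
            "in": in_links,
            "out": out_links,
            "avg_in": round(in_links / nodes[name], 2),
            "avg_out": round(out_links / nodes[name], 2),
            "pct_inter": round(100 * inter_out[name] / out_links, 2),
        }
    return stats


table = summarize(
    intra={"blue": 5405, "red": 5047},
    inter_out={"blue": 665, "red": 1329},
    nodes={"blue": 216, "red": 244},
)
```

  <p>
    Here <code>table["red"]["pct_inter"]</code> is 20.84 against 10.96 for blue, a ratio of about 1.9.
  </p>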
  
  <p>
    If you look at the graph showing both groups together (edges not drawn) it's clear that some
    nodes, for example blprnt and pitchinteractiv, are on the border between the groups which suggests they likely have a fair number
    of cross-group connections. 
  </p>
  
	<center>
    <a href="http://neoformix.com/2012/DataVisField1000BlueRed.pdf">
		<img  src="http://neoformix.com/2012/DataVisFieldBlueRed.png" border="1" width="700" height="1140">
    </a>
	</center>
  
  <p>
    By looking at the details of the connections and their strengths we can quantify the 'blueness' or 'redness'
    of any particular node. This indicates how embedded they are within their own group. We can also do this separately
    for both incoming and outgoing links but I'll keep it simple for now and show one value that reflects both
    types of links together. This first table shows the top blue accounts (by degree) sorted by how 'blue' they
    really are.
  </p>
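
  <p>
    The exact formula for 'blueness' is not spelled out here, so the Python sketch below shows one plausible reading only:
    the share of a node's total link weight, incoming and outgoing combined, that connects it to accounts in the blue group.
    The function name and data layout are assumptions.
  </p>

```python
def blueness(node, edges, group):
    """edges: {(src, dst): weight}; group: {account: 'blue' or 'red'}.
    Return the percentage of the node's link weight that connects to blue accounts,
    or None if the node has no links to grouped accounts."""
    blue = 0.0
    total = 0.0
    for (src, dst), weight in edges.items():
        if node not in (src, dst) or src == dst:
            continue  # edge does not touch the node, or is a self-reference
        other = dst if src == node else src
        if other not in group:
            continue  # ignore accounts outside the two groups
        total += weight
        if group[other] == "blue":
            blue += weight
    return 100.0 * blue / total if total else None
```

  <p>
    Under this reading, 'redness' is simply 100 minus the blueness value.
  </p>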
  
  <table border="1" cellspacing="0" >
    <thead>
      <tr>
        <th class="Account-cell">Blue Account</th>
        <th class="Degree-cell">Degree</th>
        <th class="Blueness %-cell">Blueness %</th>
      </tr>
    </thead>
    <tbody>
      <tr class="firstRow">
        <td class="Account-cell">factoryfactory</td>
        <td class="Degree-cell">134</td>
        <td class="Blueness %-cell">99.03</td>
      </tr>
      <tr>
        <td class="Account-cell">kcimc</td>
        <td class="Degree-cell">166</td>
        <td class="Blueness %-cell">98.5</td>
      </tr>
      <tr>
        <td class="Account-cell">theowatson</td>
        <td class="Degree-cell">147</td>
        <td class="Blueness %-cell">98.39</td>
      </tr>
      <tr>
        <td class="Account-cell">shiffman</td>
        <td class="Degree-cell">136</td>
        <td class="Blueness %-cell">97.51</td>
      </tr>
      <tr>
        <td class="Account-cell">memotv</td>
        <td class="Degree-cell">149</td>
        <td class="Blueness %-cell">96.78</td>
      </tr>
      <tr>
        <td class="Account-cell">zachlieberman</td>
        <td class="Degree-cell">148</td>
        <td class="Blueness %-cell">96.38</td>
      </tr>
      <tr>
        <td class="Account-cell">flight404</td>
        <td class="Degree-cell">191</td>
        <td class="Blueness %-cell">93.69</td>
      </tr>
      <tr>
        <td class="Account-cell">reas</td>
        <td class="Degree-cell">231</td>
        <td class="Blueness %-cell">92.76</td>
      </tr>
      <tr>
        <td class="Account-cell">creativeapps</td>
        <td class="Degree-cell">232</td>
        <td class="Blueness %-cell">90.46</td>
      </tr>
      <tr>
        <td class="Account-cell">golan</td>
        <td class="Degree-cell">276</td>
        <td class="Blueness %-cell">88.57</td>
      </tr>
      <tr>
        <td class="Account-cell">mariuswatz</td>
        <td class="Degree-cell">249</td>
        <td class="Blueness %-cell">87.18</td>
      </tr>
      <tr>
        <td class="Account-cell">generatorx</td>
        <td class="Degree-cell">149</td>
        <td class="Blueness %-cell">86.99</td>
      </tr>
      <tr>
        <td class="Account-cell">aaronkoblin</td>
        <td class="Degree-cell">181</td>
        <td class="Blueness %-cell">85.62</td>
      </tr>
      <tr>
        <td class="Account-cell">seb_ly</td>
        <td class="Degree-cell">123</td>
        <td class="Blueness %-cell">84.42</td>
      </tr>
      <tr>
        <td class="Account-cell">cedrickiefer</td>
        <td class="Degree-cell">126</td>
        <td class="Blueness %-cell">84.18</td>
      </tr>
      <tr>
        <td class="Account-cell">lennyjpg</td>
        <td class="Degree-cell">135</td>
        <td class="Blueness %-cell">77.7</td>
      </tr>
      <tr>
        <td class="Account-cell">ben_fry</td>
        <td class="Degree-cell">207</td>
        <td class="Blueness %-cell">73.75</td>
      </tr>
      <tr>
        <td class="Account-cell">eyeofestival</td>
        <td class="Degree-cell">187</td>
        <td class="Blueness %-cell">73.19</td>
      </tr>
      <tr>
        <td class="Account-cell">blprnt</td>
        <td class="Degree-cell">309</td>
        <td class="Blueness %-cell">66.23</td>
      </tr>
      <tr class="lastRow">
        <td class="Account-cell">feltron</td>
        <td class="Degree-cell">132</td>
        <td class="Blueness %-cell">54.73</td>
      </tr>
    </tbody>
  </table>
  
  <p>
    You can see that feltron, blprnt, eyeofestival, and ben_fry are all tending towards the red which
    matches what we see in the network graphic where they are on the border.
    The table below shows how 'blue' the top twitter IDs are that were placed in the red group. Again we
    see that some accounts had significant linkages to the blue group.
  </p>
  
  <table border="1" cellspacing="0" >
    <thead>
      <tr>
        <th class="Account-cell">Account</th>
        <th class="Degree-cell">Degree</th>
        <th class="Blueness %-cell">Blueness %</th>
      </tr>
    </thead>
    <tbody>
      <tr class="firstRow">
        <td class="Account-cell">pitchinteractiv</td>
        <td class="Degree-cell">165</td>
        <td class="Blueness %-cell">35.48</td>
      </tr>
      <tr>
        <td class="Account-cell">moritz_stefaner</td>
        <td class="Degree-cell">326</td>
        <td class="Blueness %-cell">24.34</td>
      </tr>
      <tr>
        <td class="Account-cell">jeffclark</td>
        <td class="Degree-cell">163</td>
        <td class="Blueness %-cell">18.27</td>
      </tr>
      <tr>
        <td class="Account-cell">janwillemtulp</td>
        <td class="Degree-cell">290</td>
        <td class="Blueness %-cell">18.25</td>
      </tr>
      <tr>
        <td class="Account-cell">driven_by_data</td>
        <td class="Degree-cell">198</td>
        <td class="Blueness %-cell">17.71</td>
      </tr>
      <tr>
        <td class="Account-cell">mslima</td>
        <td class="Degree-cell">146</td>
        <td class="Blueness %-cell">15.9</td>
      </tr>
      <tr>
        <td class="Account-cell">wiederkehr</td>
        <td class="Degree-cell">149</td>
        <td class="Blueness %-cell">14.48</td>
      </tr>
      <tr>
        <td class="Account-cell">visualizingorg</td>
        <td class="Degree-cell">142</td>
        <td class="Blueness %-cell">11.49</td>
      </tr>
      <tr>
        <td class="Account-cell">datavis</td>
        <td class="Degree-cell">180</td>
        <td class="Blueness %-cell">10.34</td>
      </tr>
      <tr>
        <td class="Account-cell">krees</td>
        <td class="Degree-cell">172</td>
        <td class="Blueness %-cell">7.98</td>
      </tr>
      <tr>
        <td class="Account-cell">mbostock</td>
        <td class="Degree-cell">154</td>
        <td class="Blueness %-cell">7.57</td>
      </tr>
      <tr>
        <td class="Account-cell">infosthetics</td>
        <td class="Degree-cell">243</td>
        <td class="Blueness %-cell">7.45</td>
      </tr>
      <tr>
        <td class="Account-cell">noahi</td>
        <td class="Degree-cell">133</td>
        <td class="Blueness %-cell">6.17</td>
      </tr>
      <tr>
        <td class="Account-cell">flowingdata</td>
        <td class="Degree-cell">244</td>
        <td class="Blueness %-cell">5.77</td>
      </tr>
      <tr>
        <td class="Account-cell">periscopic</td>
        <td class="Degree-cell">140</td>
        <td class="Blueness %-cell">4.66</td>
      </tr>
      <tr>
        <td class="Account-cell">visualisingdata</td>
        <td class="Degree-cell">239</td>
        <td class="Blueness %-cell">2.46</td>
      </tr>
      <tr>
        <td class="Account-cell">eagereyes</td>
        <td class="Degree-cell">199</td>
        <td class="Blueness %-cell">1.44</td>
      </tr>
      <tr>
        <td class="Account-cell">albertocairo</td>
        <td class="Degree-cell">138</td>
        <td class="Blueness %-cell">1.36</td>
      </tr>
      <tr>
        <td class="Account-cell">jcukier</td>
        <td class="Degree-cell">204</td>
        <td class="Blueness %-cell">0.8</td>
      </tr>
      <tr class="lastRow">
        <td class="Account-cell">filwd</td>
        <td class="Degree-cell">163</td>
        <td class="Blueness %-cell">0.44</td>
      </tr>
    </tbody>
  </table>  
    

 ]]></description>
</item>
<item>
 <title>Data Visualization Field Subgroups</title>
 <link>http://neoformix.com/2012/DataVisFieldSubGroups.html</link>
 <guid>http://neoformix.com/2012/DataVisFieldSubGroups.html</guid>
 <pubDate>Tue, 28 Feb 2012 15:30:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    There was some interesting discussion yesterday on Twitter about my post on
    the <a href="http://neoformix.com/2012/DataVisField.html">Data Visualization Field on Twitter</a>.
    <a href="http://well-formed-data.net/">Moritz Stefaner</a> pointed out that he didn't see a big
    improvement over his VIZoSPHERE and that the topology was quite similar. Furthermore, he noted that if you rotate my version
    90 degrees counter-clockwise many of the primary nodes line up fairly closely with his. He's right, and it's
    something I missed noticing completely. It's not really surprising that an analysis of most of the
    same twitter accounts using a different connectedness metric would yield similar results. I do still
    feel the map based on tweet text account references is slightly better at the detailed local level but
    I have no objective evidence that this is the case.
  </p>
  
  <p>
    Another interesting thing I learned yesterday was that <a href="http://blogger.ghostweather.com/">Lynn Cherny</a>
    did an excellent analysis of Moritz's data back in September which is reported in
    <a href="http://blogger.ghostweather.com/2011/09/combing-through-infovis-twitter-network.html">Combing Through the Infovis Twitter Network Hairball</a>.
    She focused on the detection of sub-communities within the network using both <a href="http://gephi.org">Gephi</a>
    and <a href="http://networkx.lanl.gov/">NetworkX</a> and has some nice results.
  </p>

  <p>
    Following Lynn's lead I have spent some time looking at the communities within my data. Doing this analysis
    with Gephi yields subgroups that look like this:
  </p>
  
	<center>
    <a href="http://neoformix.com/2012/DataVisField1000_Groupings.pdf">
		<img  src="http://neoformix.com/2012/DataVisFieldGroups.png" border="1" width="720" height="746">
    </a>
	</center>
  
  <p>
    The modularity score was 0.356 which is slightly under the 0.4 boundary for significance. By visual
    inspection of the image above it seems clear that there are two coherent groups to the left and four other
    groups that are intermixed and less clearly defined. These two coherent groups correspond pretty well to what I saw by eye
    yesterday. The top-left blue group has people who focus on computational design, generative art, or design in general.
    The bottom-left red group, as I noted yesterday, seems focused more on the practical aspects of data visualization.
  </p>
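
  <p>
    For readers unfamiliar with the modularity score, the Python sketch below computes the standard modularity of a given
    partition. It is an illustration only: it treats the graph as undirected and weighted, which is an assumption, and it
    scores a partition rather than searching for one as Gephi does.
  </p>

```python
from collections import defaultdict


def modularity(edges, community):
    """edges: list of (u, v, weight) for an undirected graph without self loops.
    community: {node: label}. Returns Q = sum over communities of
    (intra-community weight / m) - (community degree / 2m)^2, with m the total edge weight."""
    m = sum(w for _, _, w in edges)
    if m == 0:
        raise ValueError("graph has no edges")
    intra = defaultdict(float)
    degree = defaultdict(float)
    for u, v, w in edges:
        degree[community[u]] += w
        degree[community[v]] += w
        if community[u] == community[v]:
            intra[community[u]] += w
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy example: two triangles joined by a single edge.
toy_edges = [
    ("a", "b", 1), ("b", "c", 1), ("a", "c", 1),
    ("d", "e", 1), ("e", "f", 1), ("d", "f", 1),
    ("c", "d", 1),
]
toy_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
toy_q = modularity(toy_edges, toy_groups)
```

  <p>
    Higher values mean more of the edge weight falls inside groups than would be expected if edges were placed at random with the same node degrees.
  </p>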
  
  <p>
    Below is a map showing only the blue group. I've also shown the top 3% of edges.
    I wasn't able to emphasize the flows as much as I would have liked but you can see some of the
    stronger edges and their direction. One of the strongest relationships visible in this map goes from @eyeofestival
    to @blprnt which indicates that a relatively high fraction of the tweets sent by @eyeofestival mention @blprnt.
  </p>
  
	<center>
    <a href="http://neoformix.com/2012/DataVisField1000_Group1.pdf">
		<img  src="http://neoformix.com/2012/DataVisFieldGroup1.png" border="1" width="720" height="698">
    </a>
	</center>
  
  <p>
    Below is the map for the red group.
    Note that you can click on any of these images to get PDF versions where you can zoom in or search
    for a particular account.
  </p>
  
	<center>
    <a href="http://neoformix.com/2012/DataVisField1000_Group2.pdf">
		<img  src="http://neoformix.com/2012/DataVisFieldGroup2.png" border="1" width="720" height="564">
    </a>
	</center>
  <br>
  
  

 ]]></description>
</item>
<item>
 <title>Data Visualization Field on Twitter</title>
 <link>http://neoformix.com/2012/DataVisField.html</link>
 <guid>http://neoformix.com/2012/DataVisField.html</guid>
 <pubDate>Sun, 26 Feb 2012 20:30:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    I consider myself one small part of a community on Twitter that focuses on information visualization,
    computational design, and interaction design. Collectively we tweet about our personal work, highlight
    other work of quality or that has interesting characteristics, critique approaches or individual
    designs, discuss tools and techniques, and suggest interesting datasets or projects. I'm grateful
    to be connected with such an interesting group of people and I've learned a great deal from them.
  </p>
  
  <p>
    <a href="http://well-formed-data.net/">Moritz Stefaner</a> is an important part of this group and in July 2011 he created an interesting map of this
    community he calls <a href="http://well-formed-data.net/archives/642/the-vizosphere">The VIZoSPHERE</a>.
    Basically, he started from a set of 18 selected
    twitter accounts, found their friends and followers and included any twitter account that met a minimum
    criterion of connectedness. A small version of part of this map is below. Node sizes reflect the number of 
    followers within this community.
  </p>

	<center>
		<a href="http://well-formed-data.net/archives/642/the-vizosphere"><img  src="http://neoformix.com/2012/Vizosphere.png" border="1"></a>
	</center>
  
  <p>
    It's a fairly standard graph view of the network data and the sheer number of connections makes them extremely difficult
    to traverse. Like many such large network graphs the primary utility seems to come from seeing which nodes
    are largest and seeing which ones seem to be grouped together, presumably reflecting nodes that have 
    a similar set of connections to the rest of the network or strong connections between them. This can sometimes visually suggest sub-groups
    within the overall community.
  </p>
  
  <p>
    After stumbling across this work recently I decided to explore the same problem myself. Rather than
    rely on follower information for connectedness I decided to analyze the actual tweets sent and look
    for mentions of twitter IDs. These could be retweets, replies, or just references to someone in a tweet.
    For a given twitter account we are essentially looking at who they talk to or talk about. Unlike the binary
    nature of the follower connections we can also measure the strength of this connection by looking
    at how often one person mentions another.
  </p>
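
  <p>
    A minimal Python sketch of this idea follows. The regular expression, the function name, and the choice to define
    strength as the fraction of an account's tweets that mention the other account are assumptions made for illustration.
  </p>

```python
import re
from collections import Counter

# An @ that starts the text or follows a non-word character, then the account name.
MENTION = re.compile(r"(?:^|[^A-Za-z0-9_])@([A-Za-z0-9_]+)")


def mention_strengths(tweets_by_account):
    """tweets_by_account: {account: [tweet text, ...]}.
    Return {(src, dst): fraction of src's tweets that mention dst}."""
    strengths = {}
    for src, tweets in tweets_by_account.items():
        counts = Counter()
        for text in tweets:
            # Count each mentioned account once per tweet, ignoring case.
            mentioned = {name.lower() for name in MENTION.findall(text)}
            mentioned.discard(src.lower())
            counts.update(mentioned)
        for dst, n in counts.items():
            strengths[(src.lower(), dst)] = n / len(tweets)
    return strengths
```

  <p>
    Retweets, replies, and plain references are all caught by the same pattern, and the resulting connections are directed: a link from A to B says nothing about B mentioning A.
  </p>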
  
  <p>
    I started with the same set of accounts that Moritz used: @moritz_stefaner, @datavis, @infosthetics, @wiederkehr, @FILWD, @janwillemtulp,
    @visualisingdata, @jcukier, @mccandelish, @flowingdata, @mslima, @blprnt, @pitchinteractiv, @bestiario140, @eagereyes, @feltron, @stamen, and @thewhyaxis.
    I looked at the 1000 latest tweets (or as many as they had if they hadn't sent 1000) and found all the
    twitter accounts they mention. For each mentioned account I calculated its support - the number of accounts
    in the original 18 that mentioned it - and used that ranked list to enlarge my set to 50. The latest 1000
    tweets for this larger set were retrieved and analyzed in the same way to enlarge the community to 100.
    I repeated this once more and used tweets from these 100 accounts to finally get a list of the top 1000.
  </p>
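
  <p>
    One round of this enlargement can be sketched in Python as follows. The callable <code>mentions_of</code> is a hypothetical
    stand-in for fetching an account's latest tweets and extracting the mentioned accounts. Keeping the existing accounts
    in the enlarged set and breaking ties alphabetically are assumptions.
  </p>

```python
from collections import Counter


def expand(accounts, mentions_of, target_size):
    """accounts: current list of accounts.
    mentions_of(account) -> set of accounts mentioned in its latest tweets.
    Return the list enlarged to target_size using candidates ranked by support."""
    support = Counter()
    for account in accounts:
        for other in mentions_of(account):
            if other != account:
                support[other] += 1  # one vote per mentioning account
    ranked = sorted(support, key=lambda name: (-support[name], name))
    result = list(accounts)
    seen = set(accounts)
    for name in ranked:
        if len(result) >= target_size:
            break
        if name not in seen:
            result.append(name)
            seen.add(name)
    return result
```

  <p>
    Applied three times with target sizes of 50, 100 and 1000, this mirrors the progression from the 18 seed accounts described above.
  </p>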
  
  <p>
    The total number of tweets analyzed for these 1000 accounts was 821,407 and I used them to determine
    a directed connection strength between each pair of accounts. This connection data was loaded into
    <a href="https://gephi.org/">Gephi</a> which I used to produce the graph below.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/DataVisField0_s.png" border="1" width="720" height="738">
	</center>
  <br>
  <p>
    For a searchable and zoomable version use the <a href="http://neoformix.com/2012/DataVisField1000_degree_2.pdf">PDF</a>.
  </p>
  <p>
    As in Moritz's VIZoSPHERE there were so many connections that I didn't think they provided any useful
    information that could be seen with the eye so I left them out. They are used to lay out the nodes for each account, and the node
    sizes are determined by the degree - the number of edges coming into or out of the node. The bigger
    nodes can be read off from this graph - @blprnt, @moritz_stefaner, @flowingdata, @visualisingdata,
    @janwillemtulp, @infosthetics, @golan, @mariuswatz, @reas, @ben_fry, @brainpicker, @nytimes, @timoreilly. Many of
    these larger nodes are, unsurprisingly, the original seed accounts we started with.
  </p>
  
  <p>
    Looking at the details of which accounts are placed near each other seems to give reasonable
    results. @Eyeofestival is near @blprnt, @krees near @periscopic, and @mccandelish near @infobeautiful. 
    It's very likely that many nodes are placed near each other based on more global or indirect factors
    so there are still likely some surprising juxtapositions.
  </p>
  
  <p>
    Many of the initial seed accounts are in the lower left part
    of the diagram and seem to reflect a subgroup focused more on the practical aspects of data visualization.
    The top left accounts seem more to be in the area of computational design, generative art, or design
    in general. @Blprnt seems to lie between these two subgroups. The right side of the diagram seems to be more general media and data sources.
    I suspect that many of the accounts on the left side mention those on the right but the reverse is not
    true. In fact, I suspect that many of the accounts on the right side aren't really part of the community
    in that they don't strongly interact with it. They are sources but not contributors. It would be
    interesting to repeat my enlargement process from the original seed accounts with some minimum criterion for 
    two-way interaction.
  </p>
  
  <p>
    The nodes are colored based on the total number of incoming links which represent people in this community mentioning that account.
    The darker the color the more incoming links there are. So there are a lot of different people within this community referring to
    @blprnt, @flowingdata, @brainpicker and @nytimes for example. You can't extract much quantitative detail
    from a color range but it does give you a feel for which accounts are highly referenced. Note that
    the color is based on the absolute number of incoming links - not the proportion of incoming to total links. 
    That would be a more interesting measure but I couldn't easily map it to color with Gephi.
  </p>
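
  <p>
    The two measures can be computed outside Gephi. This is a sketch only: it assumes the graph is a list of
    distinct (source, target) mention pairs and takes "total links" to mean incoming plus outgoing. The function
    name and sample edges are hypothetical.
  </p>

```python
from collections import Counter


def link_measures(edges):
    """Map each account to (incoming count, incoming / (incoming + outgoing))."""
    incoming = Counter(target for _, target in edges)
    outgoing = Counter(source for source, _ in edges)
    measures = {}
    for account in set(incoming) | set(outgoing):
        # Every account here appears in at least one edge, so total is never zero.
        total = incoming[account] + outgoing[account]
        measures[account] = (incoming[account], incoming[account] / total)
    return measures


# Hypothetical sample: two accounts mention "b", and "b" mentions "a".
edges = [("a", "b"), ("c", "b"), ("b", "a")]
measures = link_measures(edges)
```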
  
  <p>
    This looks like an interesting view of the data and I'm curious to explore a few related variations.
    Note that prominence within this graphic is a fairly crude measure of overall contribution to
    the field of data visualization. Some key figures in the field don't use Twitter and so aren't
    represented here. Stephen Few is one example, even though his critiques have a huge impact and are
    discussed within the twittersphere.
    Many others, such as Ben Shneiderman (@benbendc) and Edward Tufte (@edwardtufte),
    do use Twitter, but not extensively and not to a level that reflects their value to the field. They
    do appear in this map but have very small bubbles.
  </p>

  

 ]]></description>
</item>
<item>
 <title>Einstein Word Portraits</title>
 <link>http://neoformix.com/2012/EinsteinWordPortraits.html</link>
 <guid>http://neoformix.com/2012/EinsteinWordPortraits.html</guid>
 <pubDate>Thu, 16 Feb 2012 10:39:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    I have created many <a href="http://www.neoformix.com/2008/WordPictures.html">word portraits</a> in the past and have always
    limited myself for the sake of simplicity to completely horizontal or vertical words. My interest
    in word portraits has been re-ignited by a recent client project and I've started to
    play with allowing angled text.
  </p>
  
  <p>
    In the first example below, the words are flat near the horizontal middle and
    gradually turn vertical towards the left and right edges. I also swap the orientation below the vertical middle.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/EinsteinGenius2.png" border="0" width="720" height="720">
	</center>
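
  <p>
    One way to express a rule like this is sketched below. The linear ramp, the sign convention, and the
    function name are assumptions made for illustration; the actual code behind the image may differ.
  </p>

```python
import math


def angle_from_position(x, y, width, height):
    """Word angle in radians for a word centred at (x, y) in image coordinates.

    Assumes a linear ramp: 0 at the horizontal middle, a quarter turn at the
    left and right edges, with the sign flipped in the lower half of the image.
    """
    offset = (x - width / 2) / (width / 2)  # -1 at left edge, 0 in the middle, +1 at right edge
    angle = offset * math.pi / 2
    if y > height / 2:  # below the vertical middle (y grows downward)
        angle = -angle
    return angle
```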

  <p>
    In the next example the angle of the word is determined by the brightness level at that point in the image.
    Words in white regions are flat and words in dark regions are vertical. This gives a reasonable contoured effect because the
    brightness levels in the image vary in a natural fashion.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/EinsteinGenius1.png" border="0" width="720" height="720">
	</center>
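
  <p>
    A brightness-to-angle mapping of this kind might look like the following sketch. It assumes brightness
    values from 0 to 255 and a linear mapping, neither of which is stated above, and the function name is hypothetical.
  </p>

```python
import math


def angle_from_brightness(brightness):
    """Map brightness (0 = black, 255 = white) to a word angle in radians.

    White gives 0 (flat) and black gives pi/2 (vertical), linear in between.
    """
    level = min(max(brightness, 0), 255) / 255  # clamp, then normalise to 0..1
    return (1 - level) * math.pi / 2
```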

  <p>
    For this last one the words are all angled towards a point on one of Einstein's eyes.
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/EinsteinGenius3.png" border="0" width="720" height="720">
	</center>
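
  <p>
    Pointing each word at a fixed location comes down to a single <i>atan2</i> call. In this sketch the
    function name is hypothetical and the focus coordinates are whatever pixel is chosen on the eye.
  </p>

```python
import math


def angle_toward(x, y, focus_x, focus_y):
    """Angle in radians of the direction from a word at (x, y) to the focus point."""
    return math.atan2(focus_y - y, focus_x - x)
```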
  

 ]]></description>
</item>
<item>
 <title>Spot</title>
 <link>http://neoformix.com/2012/IntroducingSpot.html</link>
 <guid>http://neoformix.com/2012/IntroducingSpot.html</guid>
 <pubDate>Thu, 12 Jan 2012 10:15:00 GMT</pubDate>
 <description><![CDATA[
 
  <p>
    <i>This post was modified on February 15th, 2012 to reflect changes in the software.</i>
  </p>
  <p>
    <a href="http://neoformix.com/spot">Spot</a> is an interactive real-time Twitter visualization that uses a particle
    metaphor to represent tweets. The tweet particles are called <i>spots</i> and get organized in various configurations
    to illustrate information about the topic of interest.
  </p>
  
  <p>
    Spot has an entry field at the lower-left corner where you can type any valid Twitter search query. The latest 200 tweets
    will be gathered and used for the visualization. Note that Twitter search results only go back about a week so a search
    for a rare topic may only return a few. When you enter a query the URL is changed so you can easily bookmark it or send it to someone.
    The query <a href="http://neoformix.com/spot/#/brainpicker">brainpicker</a> gives you a display something like this:
  </p>
  
	<center>
		<img  src="http://neoformix.com/2012/spot0.png" border="0" width="720" height="419">
	</center>

  <p>
    At the top left, next to the logo, are six icons to access the different views. The first is called Banner mode and is shown above.
    Basically, tweets that share a lot of the same words are grouped together and the top five groups are shown. Tweets are often grouped
    because they are retweets of the same original content but this doesn't have to be the case. They may be tweets from different
    people that don't even know each other but happen to be discussing the same thing. The intent is to show quickly the most
    popular things people are saying about a particular topic. Tweets that are more unique are placed in the phyllotaxy spiral
    to the right.
  </p>
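
  <p>
    The grouping idea can be illustrated with a simple word-overlap measure. This sketch uses Jaccard similarity
    and a greedy single pass; the threshold, the function names, and the sample tweets are all assumptions,
    and Spot itself may group tweets differently.
  </p>

```python
def words(text):
    """Lower-cased set of words with surrounding punctuation stripped."""
    return {w.strip(".,!?:;\"'()").lower() for w in text.split()} - {""}


def jaccard(a, b):
    """Share of words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_groups(tweets, threshold=0.5, top_n=5):
    """Greedily group similar tweets; return (largest groups, ungrouped tweets)."""
    groups = []  # each group is a list of tweets; the first one is its representative
    for tweet in tweets:
        tweet_words = words(tweet)
        for group in groups:
            if jaccard(tweet_words, words(group[0])) >= threshold:
                group.append(tweet)
                break
        else:
            groups.append([tweet])  # nothing similar enough, so start a new group
    shared = sorted((g for g in groups if len(g) > 1), key=len, reverse=True)
    unique = [g[0] for g in groups if len(g) == 1]
    return shared[:top_n], unique


# Hypothetical sample tweets.
sample = [
    "Great new map of Paris",
    "RT great new map of Paris!",
    "Lunch time",
    "great NEW map of paris today",
]
groups, unique = top_groups(sample)
```

  <p>
    Here the three map tweets end up in one group even though only one of them is a retweet, and the unrelated
    tweet is left over as a unique one.
  </p>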
    
  <p>
    All the tweet spots show an image of the sender and at any time can be clicked on to see the tweet details.
    Clicking on the text of an open tweet will show the original in another browser window. To close an open tweet, click on it,
    on the background, or directly on another spot.
  </p>
  
  <h3>The Different Views</h3>
  
  <p>
    Here is a complete list of the views and what they show:
  </p>
  
    <ol>
    <li>Banner View (speech icon) shows the top five groups of similar tweets</li><br>
    <li>Timeline View (watch icon) places tweets along a timeline based on when they were sent</li><br>
    <li>User View (person icon) shows a bar chart with the people sending the most tweets in the set</li><br>
    <li>Word View (word circle icon) shows word bubbles with tweets attracted to the words they contain</li><br>
    <li>Source View (megaphone icon) shows a bar chart of the tools used to send the tweets (or sometimes the news source)</li><br>
    <li>Group View (circles in circle icon) places tweets that share common words inside large circles</li><br>
    </ol>
  
  <p>
    The Word View, again for the query <a href="http://neoformix.com/spot/#/brainpicker">brainpicker</a>:
  </p>
  
 	<center>
		<img  src="http://neoformix.com/2012/spot2.png" border="0" width="720" height="446">
	</center>

  <br>
  
  <h3>User and Twitter List Queries</h3>
  
  <p>
    The string 'brainpicker' matches the wonderful Twitter account by <a href="https://twitter.com/#!/brainpicker">Maria Popova</a> and
    the results shown above are mainly retweets of or discussions about the tweets she has sent. You can also do a search
    for <a href="http://neoformix.com/spot/#/@brainpicker">@brainpicker</a> including the <i>@</i> sign to see the latest
    tweets sent from that account. This uses the standard Twitter API to get the data and so can go back farther in time.
    The Word View for this query clearly shows the Brainpicker focus on books, reading, writing, art, and maps.
  </p>

 	<center>
		<img  src="http://neoformix.com/2012/spot3.png" border="0" width="720" height="446">
	</center>
  
  <p>
    You can also retrieve the latest tweets from a Twitter list. Here is an example using a list I <a href="http://neoformix.com/2009/CreatingTopicalTwitterLists.html">created</a>
    by analyzing who appeared on various existing lists about data visualization. In the search field enter <a href="http://neoformix.com/spot/#/@Top100in/datavis">@Top100in/datavis</a>
    and you should get something like this for the User View:
  </p>

 	<center>
		<img  src="http://neoformix.com/2012/spot4.png" border="0" width="720" height="513">
	</center>
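
  <p>
    The three query forms (plain search, @user, and @user/list) and the bookmarkable URL can be told apart with a
    few string operations. This is a sketch under assumptions: the function names and return shape are hypothetical,
    and the URL pattern is simply copied from the example links in this post, with no escaping of special characters.
  </p>

```python
def classify_query(query):
    """Return (kind, name, list_name) where kind is 'search', 'user', or 'list'."""
    query = query.strip()
    if query.startswith("@"):
        owner, sep, list_name = query[1:].partition("/")
        if sep and list_name:
            return ("list", owner, list_name)
        return ("user", owner, None)
    return ("search", query, None)


def bookmark_url(query, base="http://neoformix.com/spot/"):
    """Build a URL of the form used by the example links in this post."""
    return base + "#/" + query
```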

  <br>
  <h3>Technology and Credits</h3>
  
  <p>
    I was inspired to create this when playing with the wonderful Twitter visualization called <a href="http://moritz.stefaner.eu/projects/revisit/">Revisit</a>
    by Moritz Stefaner. Another influence was Stamen's work on Digg Swarm, which is no longer active,
    although there is a <a href="http://www.youtube.com/watch?v=vXVbxtfJBCk">video</a> of it. My academic background in physics
    makes it natural for me to think in terms of interacting particles.
  </p>
  
  <p>
    This application was created with the wonderful Processing.js, which is the JavaScript-based extension of
    the Processing tool I have used in the past. Thanks to Ben Fry, Casey Reas, John Resig, David Humphrey and the other
    people in the Centre for Development of Open Technology at Seneca College. Thanks also to Jim Bumgardner for
    the excellent <a href="http://www.krazydad.com/tutorials/circles_js/">tutorial</a> on phyllotaxy spirals and to
    <a href="http://thenounproject.org">The Noun Project</a> for five of the icons. Thanks also of course to Twitter and all
    the people who fill it with great content!
  </p>
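
  <p>
    For readers curious about the phyllotaxy spiral used for the unique tweets in Banner mode, the standard
    construction places point number <i>i</i> at a radius proportional to the square root of <i>i</i> and rotates
    each point by the golden angle. The sketch below shows that general construction; the function name,
    spacing, and centre parameters are hypothetical and are not taken from Spot or from the tutorial.
  </p>

```python
import math

GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))  # about 137.5 degrees, in radians


def phyllotaxis(count, spacing=10.0, cx=0.0, cy=0.0):
    """Return `count` (x, y) points arranged in a phyllotaxy spiral around (cx, cy)."""
    points = []
    for i in range(count):
        radius = spacing * math.sqrt(i)  # keeps the density of points roughly even
        theta = i * GOLDEN_ANGLE
        points.append((cx + radius * math.cos(theta), cy + radius * math.sin(theta)))
    return points
```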
  
  <p>  
    Performance is pretty good with the Chrome browser, and decent in
    Firefox and Safari. It will not work in Internet Explorer (except perhaps the new IE 9). It seems to work reasonably
    well on the newer iPads, although the search field is currently broken in that environment.
    The application fetches new tweets periodically. For popular queries, the analysis and display of those
    tweets will often cause noticeable lag.
  </p>

  

 ]]></description>
</item>

</channel>
</rss>
