What I find most interesting is that you can quickly get an idea of how much conversation is in one of these discussions by the number of nodes that are colored (signifying users who have posted more than once).
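For anyone who wants a number to go with that visual impression, here is a minimal sketch of the same idea: the share of distinct users in a thread who posted more than once. The function name `repeat_poster_share` and the sample author list are made up for illustration; the original visualization may compute this differently.

```python
from collections import Counter


def repeat_poster_share(authors):
    """Return the fraction of distinct users who posted more than once.

    `authors` is a list with one author name per comment (hypothetical input).
    """
    counts = Counter(authors)
    if not counts:
        return 0.0
    repeat_posters = sum(1 for n in counts.values() if n > 1)
    return repeat_posters / len(counts)


# Hypothetical thread: alice and bob each post twice, carol and dave once.
authors = ["alice", "bob", "alice", "carol", "bob", "dave"]
share = repeat_poster_share(authors)
```

A higher share would correspond to more colored nodes in the picture, i.e. more back-and-forth rather than drive-by comments.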
A very nice feature to have would be to run topic modeling (say, your run-of-the-mill LDA) on the comments, and then colormap the nodes so as to preserve the distances between the comment topic vectors. This way you'd be able to see threadjacking, etc.
Wow, that's a great idea for next steps. This is the direction I'd love to take this work: getting into the comments and running an analysis. The comments range in size, and some have subtle humor; I wonder how that would affect the LDA.
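A rough sketch of the coloring step, assuming the topic model has already been fitted elsewhere and was limited to exactly three topics (both are assumptions, not something from the original project). With three topics, each comment's topic proportions can be used directly as RGB channels, so distances between colors track distances between topic vectors up to rounding. The name `topic_color` is hypothetical.

```python
def topic_color(topic_vector):
    """Map a 3-topic probability vector to a hex RGB color string.

    Assumes `topic_vector` holds three proportions in [0, 1], e.g. one
    row of an LDA document-topic matrix fitted with three topics.
    """
    if len(topic_vector) != 3:
        raise ValueError("this sketch only handles exactly three topics")
    # Clamp each proportion to [0, 1], then scale to the 0..255 channel range.
    channels = [round(255 * min(1.0, max(0.0, p))) for p in topic_vector]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Hypothetical topic vectors for two comments in the same thread.
on_topic = topic_color([1.0, 0.0, 0.0])
off_topic = topic_color([0.0, 0.0, 1.0])
```

A subtree that drifts from one color to another would then be a candidate for threadjacking. With more than three topics, the vectors would first need to be projected down to three dimensions, which this sketch does not attempt.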
The Reddit visualization I'd most like to see is one of votes/comments over time on highly partisan (not necessarily political) topics. When I was using Reddit I perceived distinct "tides" in those cases.
Comment scores would swing up and down by double and triple digits as early burying/back-patting was overcome by more moderate opinions when the post rose in exposure. Likewise, having a post linked on other large forums would result in new waves of like-minded moderation.
Are there any posts in particular you're thinking of? This would be a very compelling visualization. Haven't seen posts like this myself, but I imagine if I hang around /r/politics I'll see one.
>"Are there any posts in particular you're thinking of?"
No, I haven't frequented Reddit for a few years now. I expect /r/politics would have been a goldmine for this last fall. I felt it was most apparent on topics that can be construed to involve race.
Good stuff! I wonder if you could somehow classify the conversation trees using depth or branching factor or whatever to predict and/or discover hot topics. Like you said, AMAs tend to have a pretty obvious structure but maybe that could be extended to discover "controversial" threads -- or even people!
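As a starting point for that kind of classification, here is a minimal sketch that computes two such features, maximum depth and average branching factor, from a comment tree. It assumes the thread is available as a dict mapping each comment id to its parent id, with `None` for top-level comments; that format and the name `tree_stats` are assumptions for illustration.

```python
from collections import defaultdict


def tree_stats(parent_of):
    """Return (max_depth, avg_branching) for a comment tree.

    `parent_of` maps comment id -> parent comment id, or None for a
    top-level comment. Top-level comments have depth 1.
    """
    children = defaultdict(list)
    for node, parent in parent_of.items():
        children[parent].append(node)

    # Iterative depth-first walk from the top-level comments.
    max_depth = 0
    stack = [(node, 1) for node in children.get(None, [])]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        stack.extend((child, depth + 1) for child in children.get(node, []))

    # Average number of direct replies, over comments that got any reply.
    reply_counts = [len(kids) for parent, kids in children.items()
                    if parent is not None]
    avg_branching = sum(reply_counts) / len(reply_counts) if reply_counts else 0.0
    return max_depth, avg_branching


# Hypothetical thread: "a" has replies "b" and "c", "b" has reply "d".
thread = {"a": None, "b": "a", "c": "a", "d": "b", "e": None}
depth, branching = tree_stats(thread)
```

Deep, narrow trees would suggest a long argument between a few people, while shallow, wide ones look more like an AMA; whether those features actually predict "controversial" threads is an open question.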
That's an awesome idea. A buddy was telling me to create a bot for it too: if a conversation derails enough and focuses on one of the comments, it would post an image of the derailed network. On discovering hot topics, that makes sense. I bet there is a characteristic "engaging" type of network, regardless of what exactly the content is.
It would be pretty interesting to see this applied to HN. I realize it would be more difficult because there isn't an API and pg is pretty harsh with the rate limiting.
Yeah, I thought about this a bit. I've seen HN threads turn nasty when they have several layers of depth. Those kinds of conversations are discouraged too, with the delay for response. Either way, you're right: it would be cool to see how those rules (both social and built-in) affect the network structure.
I found it extremely beautiful, especially with the context you provided. Some large framed prints with the text in little gallery placards would make an interesting exhibit.
Also, you can see a sort of meta discussion here on the /r/dataisbeautiful subreddit: http://www.reddit.com/r/dataisbeautiful/comments/1jqz3f/redd... It's kind of hilarious because people leave threads of comments to create new branches.