

Document Growth by Spidere
December 28th, 2007 3:57 AM
I've been wanting to do some quantitative analysis of SF0 for a while now, and this task allowed me a framework to do just that. I've tried to include only the results I think are most interesting here in the main section (with additional analyses found in the comments on the additional graphs at the end), but there's still a lot of analysis here, so tread carefully.
Oh, and on a completely unrelated note, you should go vote for Doorhenge, if you haven't already. It embodies so many of the best things in SF0: collaboration, epic scale, creating something new and wonderful, and beautiful documentation. (Disclaimer: I am not a collaborator on Doorhenge) Anyway, enough with the plug; on to the task!
SF0 Growth
Growth in Players Joining
SF0 has had significant growth in players this era:

We can also look at a smoothed density estimate of the rate of joining over time, to get a clearer picture of when people are joining (though note that in these kernel density estimates, the sides are lower than they should be, because of the way I'm estimating--so joining at the beginning and end of the era was actually higher; I'll try to fix that in future analyses):
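To make the edge effect concrete, here is a minimal sketch of a plain Gaussian kernel density estimate. It is not the code used for the graphs in this writeup; the data, the bandwidth, and the function name `gaussian_kde` are all made up for illustration. With a perfectly flat joining rate, the estimate is right in the middle of the era but only about half the true value at the edge, because half of each kernel near the boundary falls outside the era.

```python
import math

def gaussian_kde(samples, bandwidth):
    """Return a function that estimates the density of `samples` at a point x."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return density

# Hypothetical data: one player joins every day of a 100-day era,
# so the true joining rate is flat at 1/100 per day.
join_days = [d + 0.5 for d in range(100)]
density = gaussian_kde(join_days, bandwidth=5.0)

middle = density(50.0)  # close to the true flat rate of 0.01
edge = density(0.0)     # close to 0.005: half the kernel mass spills past day 0
```

The same dip appears at the end of the era, which is why the first and last weeks look quieter in the smoothed graphs than they really were.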

Jenny il saw this as I was working on it and mentioned that this looks very similar to a graph for social action used by grassroots effort groups--when young people are most likely to get involved in social action. Apparently it has the same timing of rise and fall, which would seem to indicate that SF0's players have a similar demographic. I thought that was really interesting.
Some caveats here: I couldn't get good "joined on" dates for players, so I restricted the analysis to only players with photos--anyone without a photo is not really a significant part of SF0 yet, and I can treat the photo posting date as a "started on" date. I also had some data quality issues: I removed players with "0" as a photo posted date, and anyone with 0 score (there were 4 players with photos, but 0 score). There were also two anomalous groups of players which I removed: sock puppets created for the Bush election attempt and sock puppet philosophers. You can see the sock puppets in some of the additional graphs at the end, labeled as including sock puppet players.
It was also kind of interesting to take a look at player growth by geography. San Francisco was by far the most popular city, but there were a number of other cities which were interesting to look at. Chicago was number 2, and the growth in Pittsburgh and St. Paul was neat. You can see that for a number of cities (Brentwood, New York, Pittsburgh, St. Paul) the number of players jumped up by several players at once--which I think shows the strong social aspect to getting friends to join. Sadly, most of the city entries were blank, so I didn't get too much of interest here.



(One other thing I discovered in this portion was that cities used to be entered manually. For example, there were some 'san francisco' entries, which I combined with 'San Francisco'. But I think my favorite city was "1NTERNETS!!!LOL!!", which is where wilhelmfink apparently lives. Also, a number of the scores on the players page are not quite right; again, look at wilhelmfink, who has 648 points here, but only 10 on his main page.)
We can also break this down by people's current level; see when all the current level 8 players joined, for example. Interestingly, as you break it down by level (level 2 or higher), you see that although the players of each level have grown during Glasnost, most of the players for that level had already joined SF0 during Impossible Exchange. Maybe part of that is the social factor (perhaps being known to people here encourages more votes), or being more practiced with the system.

If we turn this into a kernel density estimate for when people of each level joined, just for those joining in the Glasnost era, we do see that the higher-level players joined earlier, and the lower-level players joined more recently. This only makes sense, since those who joined earlier have had more time to complete tasks and get votes. Flitworth, lara black, and The Revolutionary are notable outliers.

Growth in Task Completions
The number of task completions has also been steadily growing over the era:

It's a little bit difficult to see in the cumulative graph, so I also made a kernel density estimate for the rate of task completions over time. It's a little easier to see the initial rush of completions and slowdown to September here (though again, the beginning and end of the era are shown as lower than they should be, because of how I did the estimate; using kernels which are limited by the timespan would result in a more accurate picture at the ends)
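One common way to keep the kernels inside the timespan is reflection: mirror the data across each end of the era so that the kernel mass that would have spilled outside is folded back in. The sketch below is only an illustration of that technique under made-up data, not the method used for these graphs; `reflected_kde` and its parameters are hypothetical names.

```python
import math

def reflected_kde(samples, bandwidth, lower, upper):
    """Gaussian KDE restricted to [lower, upper], using reflection at both ends."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    # Mirror every sample across each boundary of the timespan.
    left = [2 * lower - s for s in samples]
    right = [2 * upper - s for s in samples]

    def kernel_sum(x, points):
        return sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in points)

    def density(x):
        if x < lower or x > upper:
            return 0.0  # no completions can happen outside the era
        return norm * (kernel_sum(x, samples) + kernel_sum(x, left) + kernel_sum(x, right))

    return density

# Hypothetical data: one completion per day over a 100-day era (true rate 0.01/day).
completion_days = [d + 0.5 for d in range(100)]
density = reflected_kde(completion_days, bandwidth=5.0, lower=0.0, upper=100.0)

at_start = density(0.0)    # close to 0.01 instead of being halved
at_end = density(100.0)    # close to 0.01 as well
```

Reflection assumes the rate is roughly level near each boundary; if activity is ramping up sharply at the end of the era, it will still flatten that ramp somewhat.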

When we put praxes posted and players joining on the same graph, we can see that there does appear to be a pattern of people joining, then performing tasks (there is a lull in people joining, followed by a lull in tasking; later a surge, followed shortly by a surge). So it seems like new players could be responsible for a lot of the tasks on the praxis. Of course, it could simply be that SF0's demographic is more active during these times, leading to more new players and more task completions. One interesting thing to look at might be whether this holds if we consider higher-class tasks (i.e., tasks which garner a certain number of votes). I'd also like to do some functional data analysis to confirm the theory that new players have an initial surge of tasking, which often falls off over time; it might also be interesting to see how the proportion of Player Photograph tasks changes over time.

SF0 Changes
Growth in collaboration
Over time, the average number of collaborators per task has been growing.

It may be worth examining to see how much of this is due to outliers (a few epic collaborations with many players). However, in the additional graphs, there is also a graph showing the growth in the proportion of tasks which are collaborations (rather than individual); this also seems to be growing.
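As a small illustration of the outlier concern, the sketch below compares the mean with two measures that a single epic collaboration cannot dominate: the median and the share of tasks with more than one player. The numbers are invented for the example and are not SF0 data.

```python
from statistics import mean, median

# Hypothetical data: number of players credited on each completed task.
players_per_task = [1, 1, 1, 2, 1, 3, 1, 1, 2, 40]

avg_players = mean(players_per_task)        # 5.3, pulled up by the one 40-player epic
typical_players = median(players_per_task)  # 1.0, unaffected by the epic
collab_share = sum(1 for n in players_per_task if n > 1) / len(players_per_task)  # 0.4
```

If the mean grows over time while the median and the collaboration share stay flat, the growth is coming from a few very large collaborations rather than from a general shift toward collaborating.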
Votes per task
The number of votes per task has been gradually growing over time; this may be due to the increasing number of players, or to more impressive tasks.

This does not necessarily indicate that it is now easier to get votes. In order to check this, I would need to see when the votes were cast for each task--we could then see if the pattern of votes over time, per task, has changed for more recent tasks. (For example, it could be that older tasks get more votes because they've been around longer, although a Poisson regression shows the effect of time is negligible.)
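For readers unfamiliar with Poisson regression, here is a minimal sketch of the idea: model the expected vote count of a task as exp(b0 + b1 * age) and estimate b0 and b1 by maximum likelihood. This is a hand-rolled Newton's method fit on invented data, not the analysis actually run for this writeup; `fit_poisson` and the variable names are hypothetical.

```python
import math

def fit_poisson(x, y, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton's method; returns (b0, b1)."""
    b0 = math.log(sum(y) / len(y))  # start from the intercept-only fit
    b1 = 0.0
    for _ in range(iterations):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information (negative Hessian), a symmetric 2x2 matrix.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical data: task age in weeks and the votes each task has received.
task_age_weeks = [0, 1, 2, 3, 4, 5, 6, 7]
votes = [5, 7, 6, 4, 6, 5, 7, 4]

intercept, slope = fit_poisson(task_age_weeks, votes)
# exp(slope) is the multiplicative change in expected votes per extra week of age;
# a slope near zero means age has little effect.
weekly_factor = math.exp(slope)
```

In practice a statistics package would also report a standard error for the slope, which is what lets one call the effect negligible rather than just small.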
We can similarly note that the average number of comments per task has been growing; this, again, could be from the increasing number of players, or like the number of votes, could indicate people becoming more involved. We could check the player hypothesis by seeing if there is a relationship between the number of players and the average number of votes or comments per task.

Groups
The third section deals with groups--how they've grown over time, and how opportunities for each group grow as one increases in level. Some of this is to give some hard data when figuring out how the groups are working (though I trust Sean's pronouncement that groups in the next era will be fantastic)
The first part of this is an analysis of the number of tasks available to each group, as one increases in level.

I had a whole big speech prepared for this earlier, but now that the revolution is over, I'll just point out that the UofA does not have the most tasks available to it; in fact, the HC is the only group with a dominating (and statistically significant) lead over the others.
However, the UofA is dominant in terms of the number of players who choose it; this is not just holdover from Glasnost, either--new players joining during Glasnost consistently joined the UofA at a significantly higher rate than the other groups. The number of people, I think, is actually the main thing which sets UofA apart from the other groups. You can see the growth in each group's members over time:

One interesting note is that the graph is actually dominated by that highest line, the groupless, which is made up almost entirely of holdovers from Impossible Exchange who have not been active during the Glasnost Era. It also includes SPAR.
We can also look at a kernel density estimate of the rates of players joining each group (over time) during Glasnost, just to see when each group was getting new members:

Personal Growth
This final section deals with personal growth in SF0. Part of this is to see how my involvement with SF0 has grown; part of it is an attempt to calculate some more metrics (like those mentioned here), and part is an attempt to investigate voting and see what the right way of voting is.
The first piece was a graph showing my growth in points over time, breaking it up into points from votes (blue) and from the tasks themselves (red). Because this is a personal graph, I marked the timeline with my completed tasks instead of dates. I also added horizontal and vertical lines to indicate when I reached each level:

Some things I found most striking are the big jump in points from Trajectory of Desire (400 points for any task is really huge) and the fact that most of my points are due to votes rather than from the task points themselves.
Player Satisfaction Metrics
There were a number of other scores that I wanted to compute, and see how they changed over time, most of them arising from Ziggy C.'s Discussion Forum. First was the average number of points per task, which mainly grows as older tasks gain votes (though it dips at most later task postings, because when first posted, they bring the average down). Votes per task (in the graphs at the end) shows a similar pattern.
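The dip-then-climb shape of the points-per-task average can be reproduced with a tiny running calculation. The event log and point values below are hypothetical (in particular, the points awarded per vote are an assumption made only for this example).

```python
# Hypothetical event log for one player: ("task", base_points) when a task is
# posted, ("vote", points) when any of the player's tasks receives a vote.
events = [("task", 15), ("vote", 5), ("vote", 5), ("vote", 5),
          ("task", 10), ("vote", 5), ("vote", 5)]

total_points = 0
task_count = 0
averages = []  # average points per task after each event
for kind, points in events:
    total_points += points
    if kind == "task":
        task_count += 1
    averages.append(total_points / task_count)

# averages is [15.0, 20.0, 25.0, 30.0, 20.0, 22.5, 25.0]:
# it climbs with each vote and drops when the second task is posted.
```

A freshly posted task starts with only its base points, so it lowers the average until it has had time to collect votes of its own.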

BU's idea of slugging percentage and On-base plus Slugging (OPS) were also interesting; I'll refer you to his comments for more details, but they're interesting metrics for player satisfaction and it's nice to see that mine do in fact grow over time.


It would also be great to normalize these metrics by the number of players (to try to take care of the increasing number of players over time) or by the number of collaborators on each task, but I haven't looked at that yet.
Voting
There were also some analyses I wanted to do related to votes cast and votes received. I've been mulling over the idea of the vote ratio, after zemaluco's post regarding the vote economy. In particular, the 1:1 ratio I think is not quite right, because of collaboration. That is, it would be impossible for everyone to have a 1:1 votes ratio because if you vote for a collaboration, each person on there gets a vote--and so even though you've cast one vote, they've each gained a vote (and so each have to vote to get closer to 1:1). So, by collaborating, what we are really doing is growing the vote economy, as it were. You can see this if you look at two graphs. The first is a plot of votes cast and votes received over time:

The second is the total votes contributed (when a vote is cast, counting each person who receives one) and votes received. As you can see from this, I have actually contributed significantly more votes to the economy than I have received myself.
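The difference between votes cast and votes contributed can be sketched in a few lines. The records below are invented for illustration; the only rule taken from the text is that a vote for a collaboration counts as a vote received by each player on it.

```python
# Hypothetical records: each vote one player has cast, with the number of
# players credited on the task that received it.
votes_cast = [
    {"task": "solo task A", "collaborators": 1},
    {"task": "duo task B", "collaborators": 2},
    {"task": "epic task C", "collaborators": 12},
]

cast = len(votes_cast)                                     # 3 votes cast
contributed = sum(v["collaborators"] for v in votes_cast)  # 15 votes received by others
```

Summed over the whole game, total votes received equals total votes contributed, which is at least total votes cast. So as soon as anyone votes for a collaboration, the game-wide average of votes received per vote cast is above 1, and not everyone can sit at exactly 1:1.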

Community Involvement
The final part of a player analysis, and to me the most potentially interesting, was to see how my involvement in the SF0 community has grown over time. So I plotted total votes cast, votes received, and texts posted.

I think it's neat to see all of these really start growing around August 22nd; this is when I started actually responding to texts with more texts, which I think is crucial to the community of SF0. It's also when I posted a series of completions, including A place you have never been, which is still one of my proudest completions so far.
Some other statistics on votes-per-day and tasks-per-day were also created; you can see graphs in the pictures below.
I also analyzed a few other players (with permission): Charlie Fish, YellowBear, and Burn Unit. Their analyses are in the pictures below (removed simply to keep this completion from going that much longer).
Final Notes
Even though there's a lot of analysis here already, I've barely scratched the surface of exploratory SF0 data analysis. There are many more things I want to do, including more rigorous testing of some of the ideas above and below, as well as new investigations and explorations. I'd love to do some more geographical analysis, if I could figure out how to better determine locations. I'd really like to take a look at functional data analysis on how tasks gain votes over time--is there a class for which many votes come after it's on the praxis main page? How do people tend to find and vote for tasks? I've also just realized that I actually could extract every player's task progression from the data I have, so could do functional data analysis of players to see how people's task posting changes over time (such as what affects when people tend to hit their stride, and what differentiates players of different levels).
Anyway, I love SF0, and I love data analysis. If you've worked your way through this whole proof, thank you. I hope you've found it interesting.
38 vote(s)
- Flitworth
- Charlie Fish
- rongo rongo
- Lincøln
- Fonne Tayne
- Burn Unit
- Terpsichore
- teucer
- YellowBear
- Lank
- JJason Recognition
- Sui Generis
- Sean Mahan
- JTony Loves Brains
- anna one
- lara black
- Stu
- GYØ Ben
- Magpie
- Darkaardvark
- Natalia Envy
- High Countess Emily
- help im a bear
- ananas
- Loki
- Jane Doe
- Adam
- The Revolutionary
- Jellybean of Thark
- susy derkins
- Tøm
- Jack
- The Vixen
- Optical Dave
- Blue
- Bex.
- Ben Yamiin
- Supine ⠮⡽⣪Rocket
35 comment(s)
This must've taken you ages.
What happened in September that led to such a drop-off of activity?
I'm pleased to see your comments about the 1:1 votes cast to votes received ratio being a misleading aim. Oddly enough, I posted some thoughts about that ratio earlier today here (scroll to bottom).
I'm proud that my personal stats show that I've crammed the most tasks and points and votes into the shortest period of time. I may not be the best SF0 player, but I am the most intense!
My hourly distribution of text-posting makes a little more sense when you realise I'm in a very different time zone...
This is awesome, ultraSenator Spidere. More like this.
This is great! The funny thing about data analysis is that the more you do, the more questions you come up with that lead to more analysis. I'm glad you did this.
Document Growth by Spidere
December 28th, 2007 3:57 AM
DCØ time: 6:57 AM
Up all night? Or all week? Regardless, that's some damn drive, Sen(ex)ator the Junior. Well done.
Fish: my theory is September = Burning Man prep, event, recovery, plus new semesters at people's colleges maybe?
Spidere: the 1:1 idea is originally mine, though I think Z has advanced it with a great fervor (could you link to his post, it might be good to know exactly what he said on the subject). I don't know that I mean it as an economic indicator. I just threw out an idea that as an individual's votes cast vs. votes received approaches 1:1 or "the closer it is to 1:1 the better it is for the game." I have no proof or even evidence for this theory. I also have thought about SF0 in economic terms as you've indicated, and particularly about currency. My current favorite flogging horse is that the real currency of SF0 is collaboration(s). I feel there's some especially good reasoning for this suggested by your pointing out a single vote by one player is multiplied rapidly into votes for each participant in a collaboration. That's significant somehow in a quicksilvery way.
Oh that's interesting: note how closely the texts posted and votes received lines move with each other. The more one talks the more one gets votes? Probably not, but it's interestingly similar for 3/4 of the players you charted above.
(One other thing discovered in this portion was that cities used to be entered in manually. For example, there were some 'san francisco' entries, which I combined with 'San Francisco'. But I think my favorite city was "1NTERNETS!!!LOL!!", which is where wilhelmfink apparently lives). Also, a number of the scores on the players page are not quite right (again, look at wilhelmfink, who has 648 points here, but only 10 on his main page)
Can't this be explained by Impossible Exchange vs. Glasnost?
648 = total score, 10 = currently.
So L.A. is the fifth largest market in the game behind only San Francisco, Chicago, Oakland and Berkeley? And we have 20 players? Where are you guys? Come out of hiding already! I want new collaborators. I love the one I've got, but I want lots. I wanna be able to play hide 'n' go seek in a museum down here.
Plus you're totally a geek Spidere.
Fish: I think it really is true that the more one talks the more votes one gets. I at least - and almost certainly others - have a tendency to befriend SF0 players I interact with in any significant way. The ones who do more than just vote, of course, will have such significant interactions with more people, and thus get more friends.
And of course your friends all see your tasks on their updates page, so they can check them out and see if they're awesome enough to be worth a vote.
That is great work senator! I'm very impressed
Very impressive. Not sure I understand it all, but impressive none-the-less.
:) First, let me thank you all for responding and saying both such nice things and such interesting things. I realized, as I was going through my SF0 alerts, that while the votes and points are definitely good (and were especially enticing when I started), I have come to the point where the most exciting thing for me is when people comment on my tasks. Thank you.
As to time, zemaluco and Charlie Fish are both right; I have been working on this for a long time (it probably would have been indefinite, if it hadn't been for the era's ending--which is good; without a deadline, who knows how much more analysis I would have done before posting, months or years later) and I also spent several hours late into the night finishing it last night. It was really enjoyable to do (and got a great payoff in comments), but I think it may well have been the most effort I've put into an SF0 task so far.
Oh, and if anyone else is interested in having a personal analysis, let me know and give me an email address to send the results to. I'm pretty sure I can get the data without hitting the servers too badly (though I also recently made a large donation to SF0 to assuage my guilt--which, on a side note, made me notice that SF0 costs $200 a month, and last month they only got $33 in donations. My next project might be trying to put together donors to make sure SF0 is well taken care of.)
Now, some responses:
Flit: Your embroidered tasking hat, no less? :) I thank you.
Charlie: I quite like the idea of using activity times as a way of determining time zone--I was wondering how many people would notice that, as I don't think I even mentioned it in the main writeup. :) I thought I might use it for sock puppet detection as well (narrowing down who might be behind certain puppets), but realized that it's too easy to fake with low-volume sockpuppet postings.
rongo: Precisely. I hope to follow this up with another round sometime next era.
svn tsv (zemaluco): The Doorhenge plug is part of a continuing recent effort of mine...in my re-campaigning (taking a personal challenge to try to make it to triple Senator, and doing so to get to Level 8), I passed Doorhenge on the all-time votes trail. While I'm sure this will eventually correct itself, I find it kind of personally shaming to me, and so have been trying to spread the word (hoping to correct things before the end of the era).
Burn Unit/Charlie: I suspect the September dip is due to the demographics of SF0 (many students), as Jenny il suggested; I think more analysis (and probably more data) would be needed to see if this holds up, and we should probably try to find other plausible ideas as well.
Burn Unit: Zemaluco's comments on the 1:1 were mostly implicit in the rebukes to my Campaign Trail here, which I was hesitant to link to (due to the Doorhenge concern above). I also think you're on to something with collaboration: it is certainly the engine of growth if one looks at the vote economy, and I also think there's something more there...
Burn Unit and Peter Harmon: I do think that there's something to increased interaction leading to more awareness, and thereby to more votes...
svn tsv (zemaluco): You are completely right on the ImpEx vs. Glasnost explanation. I had missed the fact that players from ImpEx only had scores listed, but not points. I suspect that the manual city entries are from ImpEx as well.
Lincoln: While many people have Los Angeles listed, most of those are from the ImpEx era and currently only have 10 points (I believe Amsterdam had several inactive players as well, which surprised me). However, it might be possible to draw them out of retirement...
Also, you are correct: I am indeed.
Everyone: thank you again. :) Very much.
This is ridiculously awesome - we should work together to do more of this (and presumably more easily, with DB access, than whatever your hope-not-benevolently-hacking-into-MYSQL method was). More fun stats:
Total players: 2375
Active players (last two weeks): 352
Active players (today): 115
Percentage of players active: 14.82%
Total Score in the game: 417938 pts
Average Total Player Score: 175.97 pts
Current (this era) Score in the game: 258767 pts
Average Current Player Score: 108.95 pts
Total Score by active players: 264055 pts
Average player Score for active players: 750.16
Total # of tasks completed: 7420
Total # of completed tasks: 9221 (people who completed tasks)
Glasnost Tasks completed: 3703
Impossible Exchange Tasks completed: 3717
Average Score Per Completed Task: 45.32 pts
Avg completed tasks per player: 3.88
Collaboration percentage: 19.53%
Completed tasks by active players: 3876
Average Number of Completed tasks for active players: 11.01
Average Score per Active Player completed task: 68.13 pts
Wow. This is all totally fascinating. Where's that über-vote button again?
I love SF0, and I love data analysis.
amen, brother :)
woah! That is so wonderful, although i am really glad that it's not my job to do that sort of thing. well done.
Wait wait wait wait wait wait. Glasnost Tasks completed: 3703, Impossible Exchange Tasks completed: 3717? We can't let Glasnost lose this way! Especially by only fourteen tasks! We can close that gap in 3 days, totally.
In order to do 14 tasks in 3 days, we all need to do a task that nobody has completed yet.
Ready?
Begin.
As a player who considers himself to live in both Minneapolis AND St. Paul, taking into consideration their adjacency, and the fact that players from both cities can easily task in either city (or neither city--see MN0's Nintend0), I'd propose that they be counted as one, rather than two, cities.
I'm working on Confuse a Mineral. Not sure how well it'll work out, but I'm trying. My tasking is severely limited due to lack of camera access, but I think I can pull this one off.
I'm doing Eye Contact. Tomorrow.
Oh my god! This is amazing. I love data analysis. I had this strange fascination with AP Stats and analyzing people's trends. It's kind of God-like, you get to know every mathematical detail relating to SF0.
Very interesting, Spidere. Thanks for all that work, and for sharing it with us.
The distributions of events throughout the day and week are really interesting, and not what I would have imagined, either a priori or based on qualitative observations of the game.
I'm particularly surprised by all the structure in the time of player joinings by level plot. There seems to be a level 4/5 population that's very different from all the rest. Any sense of what's happening there? (I assume level 8 is dominated by a couple individuals.)
Surely London/Great Yarmouth/UK is one of the Top 8 non-SF cities.
We have 10ish players here in GY.
Yes- it's more MN0 than Minneapolis vs St. Paul. The cities run into each other in such a way that the changeover is mostly unnoticeable.
Spidere, it is for tasks like this that you get that many votes!!
The "points from votes/points from completion" (V/C) ratio on this task is already of 10.0 (150:15) at the time of this writing, UltraSenator. I propose that metric could be potentially useful, as an index of magnitude of awesomeness in the eyes of involved SF0 players.
I remember Loki proposing another metric on awesomeness (or was it amusement value?) somewhere. [edited] Found it. The L/E metric, of course!
(We will have you working on this task for a while yet, I am afraid, but that's only because it rocks the place).
Holy cow, the V/C ratio on your Campaign Trail is 23.0 (345:15)! Whatever that means, it must go into the analysis, I think.
Collaboration adds value to tasks, as we all know, but there must be a metric to distinguish epicness done by a handful of collaborators vs epicness done by a larger group. Oh, well, maybe there mustn't ... :)
At the risk of sounding crotchety and disagreeable, the whole V/C ratio thing is fairly dumb. Is a completion of 'Nearly Pointless' that gets 10 votes equivalent to a completion of 'Trajectory of Desire' that gets 4000 votes? They have the same V/C ratio.
It's just not a useful metric for measuring anything, in my opinion.
Well that's what's wrong with mere opinions, Darky...oh dear that doesn't work as a diminutive at all, does it?? And obviously not nearly as well as Inky does for Ms. Inktea. Varky? yeeeess Varky. Well that's what's wrong with mere opinions, Varky, they're completely crotchety in the face of pure statistical factitude! Hell yeah! Look at the 1:1 theory, in light of evidence, it just sounds like shit I make up in the shower. It's like a time cube or something! Statistics usually don't tell a whole story, agreed; from one angle they're equal from another they're not. But if we choose to take them seriously, our opinions mean little in the face of them, no?
When the Sun shines upon Earth, 2 - major Time points are created on opposite sides of Earth - known as Midday and Midnight. Where the 2 major Time forces join, synergy creates 2 new minor Time points we recognize as Sunup and Sundown.
The 4-equidistant Time points can be considered as Time Square imprinted upon the circle of Earth. In a single rotation of the Earth sphere, each Time corner point rotates through the other 3-corner Time points, thus creating 16 corners, 96 hours and 4-simultaneous 24 hour Days within a single rotation of Earth - equated to a Higher Order of Life Time Cube.
-1 * -1 = -1
Been wikiin', eh?
Or maybe you're a time cube believer. I mean, it makes sense.
If you believe there's only one of each time point! I prefer to think of things as a time sphere wherein each point on earth moves through the locus of all major and minor time points. Wouldn't you thus have, instead of 4 simultaneous 24 hour days, the volume of an earth-sized sphere in simultaneous 24 hour days? Or perhaps four to the volume of a sphere days? D = 4 x 10^((4/3)πr^3)?
All ONE MORAL ABC! ALL ONE OR NONE!
Is a completion of 'Nearly Pointless' that gets 10 votes equivalent to a completion of 'Trajectory of Desire' that gets 4000 votes?
Equivalent? Of course not.
Comparable in community appreciation? I honestly don't know, but I think they might be: see, of the 33 Nearly pointless completions there are only 3 with 10 or more votes: Cairo Zero by Petit Ogou Féraille and vertical nude self portraits by poon and anna one.
We have yet to see a Trajectory of Desire completion with 4000 votes, for comparison.
From a ransom note: This is not a joke. I enclose a joke for you to see the difference.
In any case, V/C ratio can go to hell. It is not that anyone will miss it :)
both those kinda tasks are in the 5th percentile pointswise, they make biased examples.
But mainly, this task has drive and that is what is worth rewarding.
*votes*
I tip my tasking hat to you, sir.