OSR Rules Families: FAQ & Methodology

Thank all y’all so much for the kind words and feedback on my big math problem! Let’s just get into it: in this post, I’m going to answer some frequently asked questions and also describe the methodology of the project. Click here to view the previous post.

Frequently Asked Questions

Q1: What do the dimensions on the graph represent?

I don’t know, you tell me! The graph does not visualize any specific dimensions or variables, but only the relative distance between each ruleset and the average of each ruleset cluster. It's technically based on principal component analysis of the different dimensions, such that the percentage on each axis represents the share of the overall variation between rulesets captured by that axis. But it's not very useful at a glance, since each axis is a mishmash of different dimensions with different weights.
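As a rough illustration of where those axis percentages come from, here is a minimal sketch in base R. The toy matrix, ruleset names, and trait names are all made up for the example; this is not the project's dataset or code, just principal component analysis run on a tiny Boolean table.

```r
# Toy data (made up): 6 hypothetical rulesets scored on 4 Boolean traits.
toy <- matrix(
  c(1, 1, 0, 0,
    1, 1, 0, 1,
    1, 0, 0, 0,
    0, 0, 1, 1,
    0, 1, 1, 1,
    0, 0, 1, 0),
  nrow = 6, byrow = TRUE,
  dimnames = list(paste0("ruleset", 1:6), paste0("trait", 1:4))
)

# Principal component analysis on the toy matrix.
pca <- prcomp(toy)

# Share of the total variance captured by each component, as a percentage.
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
print(round(pct, 1))

# Coordinates of each ruleset on the first two components,
# i.e. the positions a two-dimensional plot would draw.
print(pca$x[, 1:2])
```

Each component is a weighted blend of all four traits, which is why neither axis lines up with any single variable.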

However, you could still extrapolate some relationships between rulesets on different sides of the graph, especially because whatever is on one side is different from what is on the other side. I think that from left to right, there is a spectrum from class dependency to individual character ability dependency, which can also be viewed as “more D&D” versus “less D&D”. Not sure what to make of the vertical axis.

Q2: What games exist in the gaps in the graph? Does this represent uncharted territory?

As per Q1, the graph does not represent any specific dimensions or relationships between rulesets. It only visualizes the relative distance between each ruleset and the average of each ruleset cluster (to my understanding, the % on each axis represents the share of the overall variation captured by that dimension). Therefore the gaps are only gaps. They do not necessarily, if at all, represent potential but non-existent rulesets in between or beyond the ones given. They are just illustrative of the distance (i.e. quantified difference) between the rulesets that have been recorded. Besides, if these were gaps, would they be very interesting ones? Is anyone clamoring for, I don't know, D&D but with like ten abilities and without 1:1 combat rules? I don't know what would be different enough to be out there.

Q3: Why are some variables floating numbers (i.e. decimals) instead of booleans (i.e. 0 or 1)?

Originally, variables that represent an element of a larger set were divided by the number of elements in that set. For example, if a class is present in a ruleset, the value for that variable was 1 divided by the total number of classes in that ruleset. Theoretically, this helped to restrain the influence of group variables.

However, I did it wrong. Because the comparison works on squared values, the total score of the category came out equal to each individual column (e.g. four columns of 0.25 give 4 × 0.25² = 0.25), instead of equal to 1 like I had wanted (so that a classed game gets a difference score of 1 relative to classless games, with different classed games having a score in between 0 and 1). So, I changed those fractions to their square roots (e.g. 0.25 became 0.5) for that intended outcome.
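To make the arithmetic concrete, here is a minimal sketch assuming a made-up ruleset with four classes and plain Euclidean distance (the default for R's dist function). The variable names are placeholders for this example only.

```r
# Hypothetical ruleset with 4 classes; a classless ruleset is all zeros.
n_classes <- 4
classless <- rep(0, n_classes)

original  <- rep(1 / n_classes, n_classes)  # each class column is 0.25
corrected <- sqrt(original)                 # each class column is 0.5

# Euclidean distance between two rows of the dataset.
euclid <- function(a, b) sqrt(sum((a - b)^2))

# Original weighting: the sum of squares collapses to a single column's value.
print(sum((original - classless)^2))   # 0.25
print(euclid(original, classless))     # 0.5

# Square-root weighting: classed versus classless comes out to exactly 1.
print(euclid(corrected, classless))    # 1
```

The same holds for any number of classes n, since n columns of sqrt(1/n) always have squares that sum to 1.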

Q4: Can you include more retroclones?

I’d be interested in including more books from the 2000s and 2010s with their own takes on the formula. However, including straight retroclones would be redundant with the specific TSR edition they emulate (unless they happen to be so different). Comparing slightly-different retroclones to TSR-era D&D is better left to literary analysis than to statistical analysis, especially when on a high level of abstraction where slight differences will not be meaningfully detected.

Q5: Where’s the GLOG?

I had originally included Arnold K.’s GLOG in the dataset, then removed it when I wanted to be more stringent about including post-2019 games, and then finally forgot to add it again when including as many rulesets as I could think of. Skerples’ hack, Many Rats on Sticks (MROS), was included from the beginning since it dates to late 2019.

However, I think that the current selection of variables is not representative of what would distinguish the GLOG from the other rulesets: namely, its class templates for character creation and progression. I would like to think about how to represent different approaches to classes, but it’s sort of a difficult subject. Maybe I could just add a column called “Piecewise Characters”, to represent characters which freely acquire special abilities when they level up.

Eh, fuck it. I’ve just added the GLOG back (and corrected MROS). Now they’re grouped with the Whitehack stuff. Don’t ask me why. What is interesting is that MROS trends closer to the D&D-Like side, while GLOG trends closer to the D&D-Unlike side. Again, don’t ask me why. (I think it's because the GLOG is more of a player-side ruleset, whereas MROS is more of an out-of-the-box one.)

Q6: Can you include other editions of D&D, or PBTA games?

Statistically analyzing OSR games works because OSR games tend to be formal and predictable variations on games that are otherwise played similarly and with the same aims. Is it roll under or roll over? What classes or abilities does it have? Does it have procedures for underworld or overworld exploration? And so on.

There are four numbered editions of Dungeons & Dragons beyond the ones considered in the dataset, but each one individually has different focuses, class definitions, and other game conventions setting them apart not only from each other but, categorically, from classic and OSR games. A study to compare different editions of D&D would do better to construct the dataset specifically for D&D, focusing on variables which various editions share or do not (a mix is important).

A study of PBTA would be difficult because rules conventions between PBTA books seem to be extremely consistent with little variation, except for the fictional window dressing. If every game could potentially have its own moves and 4-6 stats, what good is it to attempt to find similarities between all of them except for some things so common that they don’t really speak to familial resemblances?

OSR games are virtually the same thing. Editions of D&D wildly differ with respect to play style and formal rules. Different PBTA games have almost the same rules. These are distinctions that are important to consider when thinking about performing a study, since you might be better served by a literary analysis than a statistical one in the other two cases.

Q7: Can you use the model to predict under what cluster a ruleset would fall?

Technically, yes: I can programmatically compare the attributes of that ruleset to the average of each cluster, and thus determine which cluster is most similar to the given ruleset. However, I am hesitant to use the model as a predictive tool since I don’t want to imply anything inherent or objective about the clusters that have been found. That is also why I haven't made a proper listing of each cluster and its contents; I'd rather this grasp at trends rather than (claim to) taxonomize rulesets.
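For the curious, that comparison can be sketched in a few lines of base R. This is a minimal illustration on made-up data, not the project's code: the toy matrix, the choice of two clusters, and the helper nearest_cluster are all assumptions for the example.

```r
# Toy data (made up): rows are hypothetical rulesets, columns are traits.
set.seed(1)
toy <- rbind(
  a = c(1, 1, 0, 0),
  b = c(1, 1, 0, 1),
  c = c(1, 1, 1, 0),
  d = c(0, 0, 1, 1),
  e = c(0, 0, 1, 0),
  f = c(0, 1, 1, 1)
)

km <- kmeans(toy, centers = 2, nstart = 25)

# Hypothetical helper: pick the cluster whose average (center) is nearest
# to the given ruleset, using squared Euclidean distance.
nearest_cluster <- function(x, centers) {
  sq_dist <- rowSums(sweep(centers, 2, x)^2)
  unname(which.min(sq_dist))
}

new_ruleset <- c(1, 1, 0, 0)  # same attributes as ruleset "a"
print(nearest_cluster(new_ruleset, km$centers))
print(km$cluster[["a"]])
```

All this does is report which cluster average a ruleset sits closest to; it says nothing about whether the clusters themselves are meaningful.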

Q8: What was your methodology?

I thought you’d never ask!

Methodology

This is basically the original methodology section I wrote before deciding to rewrite the post to be more friendly.

Working Dimensions

One of the major categorical distinctions among OSR-style manuals is whether a given ruleset is classed or classless. Among classed rulesets, i.e. those counted as having 2 or more discrete categories of characters, I measured the frequency of different classes between the rulesets. These include classes considered staples of the genre (Cleric, Fighter, Magic-User, Thief) as well as those which seem idiosyncratic but appear in multiple manuals. Some of these classes appear under different names in different rulebooks, so below are the criteria for each one as well as typical variations on their name. There are 12 total considered. Classes that are totally idiosyncratic are not considered.

Assassin: Associated with sneak attacks, disguises, and poison-making.
Barbarian: Warrior class with great strength and tribal associations.
Cleric: Priestly class with divine abilities.
Fighter (a.k.a. Warrior, Fighting-Man): Broad class skilled, in some way, at fighting.
Illusionist: Subclass of magic-user with spells focused on producing illusions.
Knight (a.k.a. Paladin): Noble or holy class associated with plate armor, horseback riding, and some magical abilities related to protection.
Magic-User (a.k.a. Mage, Wizard): Arcane class, occasionally distinguished by an academic approach.
Monk: Monastic figure, either medieval European or an Orientalist caricature.
Ranger: Survivalist class, modeled somewhat after Aragorn or Robin Hood.
Summoner: Figure who summons demons or other creatures, often occultist.
Thief (a.k.a. Expert): Class associated with unique skillsets, such as banditry.
Warlock (a.k.a. Sorcerer): Arcane class, distinguished by having innate power.

Other characteristics of characters are measured. Ancestry is defined as a character’s race or species, whose nature grants them special abilities or benefits. In some rulesets, fantasy races are encapsulated as distinct classes rather than being a separate attribute from class. An example of this can be found in Dungeons & Dragons, Basic Rules where Dwarf, Elf, and Halfling are counted as classes among Clerics, Fighters, Magic-Users, and Thieves (Moldvay 1981, p. 8). Background (a.k.a. Vocation) is similar to an ancestry, and these categories are sometimes combined. However, backgrounds are distinguished from ancestries fiction-wise by describing a character’s past experiences rather than their species’ nature. Lists of backgrounds tend to be longer than lists of ancestries, and usually proffer starting items or skills rather than special powers. Some rulesets have completely separate lists of ancestries, backgrounds, and vocations. For the purposes of this study, ancestries represent sets of “nature” characteristics while backgrounds represent sets of “nurture” characteristics. The presence of ancestries is marked with a description of their form (separate from or treated as class) while the presence of backgrounds is indicated as a Boolean value 1 (present) or 0 (absent).

Characters typically have abilities, which define broad approaches which can be taken by the character. The traditional set is of six: Charisma, Constitution, Dexterity (a.k.a. Agility), Intelligence, Strength, and Wisdom. The dataset measures the frequencies of different categories, including the traditional set and others which appear in the dataset: Luck and Willpower. Rulesets have different methods of scoring abilities, such as summing up the result of three six-sided dice. This is represented by two variables: whether or not abilities are determined via Random Scores, and whether abilities are on a Single Digit Scale or a d20 Scale (i.e. in a range of 1 to 20).

The presence of Skills (character affinities for specific tasks or situations) in each ruleset is measured as a Boolean value. Universal skill systems are uncommon in OSR rulesets, though Thief characters are often distinguished by having an exclusive set of skills. Finally, the method of receiving Starting Equipment is noted. Players will usually either purchase items, receive items as part of a package (sometimes class- or background-based), or determine items randomly. Often rulesets employ a mix of these methods. However, the overall thrust of the rule is what is counted; for example, packages with a few random or player-selected items are counted as packages.

There are typically rules to restrict the carrying capacity of characters. Traditionally, carrying capacity has been measured in pounds or in more granular units such as “coins”, as in the original Dungeons & Dragons (Gygax & Arneson 1974, vol. 1, p. 15); these are indicated by Pound Weight and Coin Weight respectively. However, it has become popular to abstract carrying capacity in terms of discrete “slots” representing the load of a significant item (whether in terms of weight, volume, or game significance). Slot Weight indicates whether players measure items in unit-quantities and, if so, whether their characters can carry only a handful or up to 20 of these units (usually about 10).

Encumbrance indicates whether there is granularity to a character’s carrying capacity, such as whether they are disadvantaged in movement or action for carrying greater than a certain number of items. Some items are representative of an indeterminate or abstract quantity, such as a ration item containing 3 “uses” of a ration (or even a random number thereof); the presence of these is indicated under Usage Dice for random duration, or Usage Dots for static duration. Some items are themselves abstract, representing vague adventurer supplies or tools, and are converted by the player into specific supplies or tools at will; the presence of these is indicated under Abstract Supply or Abstract Gear respectively.

Most of the rulesets have a universal resolution procedure (UWP), which is a sort of die roll which interfaces with a character’s abilities and skills, in order to determine the likelihood of that character succeeding or failing at a particular task. The most famous UWP was popularized by the d20 System, where characters roll a twenty-sided die and add a bonus from their abilities or skills, with the aim of rolling higher than or equal to a certain target number. The dataset attempts to distinguish between Ascending and Descending algorithms since one’s preference is sort of a hot topic. Some games without a UWP are still counted as having ascending or descending rolls, based on how they handle dice rolls that are often handled via UWP. Some rulesets also have distinct probabilities, or even rules, for attacking opponents or “saving” oneself from danger. These are indicated by Boolean values Distinct Attack Score and Distinct Save Score, respectively.

Differences in attack systems are measured. The presence of Attack Rolls is distinguished between there being one-sided rolls made by the attacker, opposed rolls made by both attacker and defender, or neither. Armor System indicates the role of armor in combat, such as whether it modifies a target’s chance of being hit, or if it reduces the damage taken by that target. Hit Points indicates the nature of damage in the ruleset:

Traditional: Characters have a total number of hit points, which are depleted when the character is hit in combat.
Split HP: Hit points are split into two resources, most often representing stamina versus flesh wounds.
Dice Rolls: When a character is hit, they roll dice corresponding to their character’s current “health” or “luck” status. The specific algorithm is usually idiosyncratic to individual rulesets.

Some rulesets include a random set of events that may occur when a character “goes down”, i.e. loses all of their hit points in combat. This is usually opposed to characters simply dying, or beginning the process of death. The presence of such a table is indicated under Death & Dismemberment.

OSR rulesets typically include a magic system. In TSR-era Dungeons & Dragons, Cleric and Magic-User characters can prepare a certain number of spells per adventure or per day. The type and number of these spells depends on the level of the character in their particular class (for example, a first-level Magic-User casts fewer and weaker spells than a second-level Magic-User). Spells were then given in preset lists for players to choose from. Later rulesets may not distinguish between high-level and low-level spells, may have randomly-defined spells, or may not even have spellcasting restricted to certain classes. Spell Levels indicates if “levels” of magic are present in a certain system. Spell Storage indicates how spells are prepared and “held onto” by characters. For example, although earlier rulesets say that Magic-Users can prepare some number of spells per day based on their level, some later rulesets represent spells as physical books or items which can be used once per day. Spell Format indicates whether spells are typically predefined or randomly generated in the ruleset, or if they are even entirely interpretive.

Finally, the presence of certain procedures (i.e. rules which govern sequential logic of play, usually executed by the referee) typical in OSR games is noted. 1:1 Combat Procedure differs from combat rules in general since it pertains to the sequential logic of an encounter, such as who goes first and when, or what actions are available, or in what sequence they are executed. It may also include rules surrounding an encounter such as reaction determination prior to a combat (itself represented by Reaction Roll, or Reaction Test for rulesets which handle reactions via UWP), or morale checks during the encounter itself, all of which contribute to the overall procession of an encounter. Mass Combat Procedure refers to such rules but for mass combat rather than the typical one-versus-one scale. Dungeon Crawl Procedure refers more broadly to procedures for exploring a location in detail, often on the scale of approximately ten-minute actions (as opposed to representing actions which take a few seconds or a few hours). Travel Procedure refers to procedures for larger scale exploration, traversing wide stretches of land for hours or days at a time (some rulesets distinguish between day-scale and a more granular scale, relegating them to long-distance travel or land exploration respectively). Downtime/Faction Procedure refers to the presence of an overarching structure, which often handles events between sessions (in a sort of “time skip”), and which especially become more prominent as characters acquire worldly power and handle events on a political scale. Some rulesets include more or less than these, but these are the contexts in which OSR games are often expected to employ some sort of procedural logic.

Statistical Methods

I recorded the attributes of each of the sampled rulesets, as described above. Since many of the quantitative attributes are Boolean values which indicate the presence of a certain trait, the mean of a Boolean attribute represents the percent of affirmative values (1) as opposed to negative values (0). For example, if the mean of a Boolean attribute is 0.75, then 75% of records have a 1 for that attribute, and the other 25% have a 0.
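A quick sanity check of that reading, using a made-up Boolean column for four hypothetical rulesets (the values are invented for the example):

```r
# Made-up Boolean column: does each of 4 hypothetical rulesets have Skills?
skills <- c(1, 1, 0, 1)

print(mean(skills))      # 0.75: share of rulesets with a 1
print(1 - mean(skills))  # 0.25: share of rulesets with a 0
```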

Qualitative variables were abstracted in the final dataset as sets of Boolean dummy variables to represent different categories.
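As a hedged sketch of what that abstraction looks like, here is the Hit Points variable from the section above turned into dummy columns in base R. The four example rulesets and their values are invented, and the actual column names in the dataset may well differ.

```r
# Made-up values of the qualitative Hit Points variable for 4 hypothetical rulesets.
hit_points <- c("Traditional", "Split HP", "Dice Rolls", "Traditional")

# The three categories described for this variable.
levels_hp <- c("Traditional", "Split HP", "Dice Rolls")

# One Boolean (0/1) column per category.
dummies <- sapply(levels_hp, function(lv) as.integer(hit_points == lv))
rownames(dummies) <- paste0("ruleset", seq_along(hit_points))
print(dummies)
```

Each row has exactly one 1, marking which category that ruleset falls under.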

Variables representing elements in a set (such as the appearance of different character classes) are weighted such that the sum of squares of these variables adds up to 1.

After producing the final version of the dataset, I wrote the following code in R to perform k-means cluster analysis [1].

# Include libraries.
library(tidyverse)
library(cluster)
library(stats)
library(ggpubr)
library(factoextra)

# Read data from CSV.
df <- read.csv("data.csv")

# Retrieve row names from the name column.
rownames(df) <- df[, 1]

# Remove the name column from the dataframe.
df <- df[, -1]

# Generate the distance matrix.
diff <- dist(df, diag=FALSE, upper=FALSE)
diff <- as.matrix(diff)

# Optional step: change distance to a difference ratio (distance over maximum distance).
diff <- diff / max(diff)

# Define the number of clusters.
num = 7

# Perform k-means clustering.
k2 <- kmeans(diff, centers=num, nstart=25)

# Visualize the resultant clusters.
fviz_cluster(k2, data=diff)

Limitations

The sample is not necessarily representative of all OSR or OSR-adjacent rulebooks, considering the wide availability of page layout software and distribution platforms which have lowered the barrier to entry for publication. It is also not necessarily representative of how hobbyists may actually play their games, whether they are using their own idiosyncratic “house rules” or are simply using rulebooks which predate the timespan of the sample. More formally, the number of people using each ruleset is just not measured, and is outside the scope of this study.

However, we may extrapolate from this study broad trends in any of these respects. This is because not only are these authors inspired by active participation in the hobby at large, but their rulebooks are significantly derivative of previous works. The OSR is both a “culture” of play, and one that is ostensibly derived ultimately from a common source (namely, TSR-era Dungeons & Dragons). The authors sampled here are working within this larger tradition and therefore can be contextualized within it.

The dimensions considered also do not represent the entirety of a ruleset, nor of any session or campaign in practice. As discussed, many of these rulesets represent player-side rules tacked onto a culturally understood set of referee-side rules and procedures, rather than a game complete out of the box; this context is important to understand the way in which such texts were expected to be used. Even more importantly, the set of variables is only useful to distinguish between rulesets on a very high level, considering attributes of these texts which are highly formal and conventional. It does not describe factors which may truly set apart one rulebook from another as a formal set of rules or as a text.

Conclusion

That’s that! If you’re curious, below is a series of screenshots using the most recent version of the dataset.

Updates

2022-01-02: The entries for S&W (and others?) are messed up because I messed with the columns in my CSV file, so they're shifted over in the Google Sheets file. Will try to fix later!

Endnotes

[1] I'd like to gently correct some commentators who have referred to the original blog post as using 'artificial intelligence' or AI. First, AI is basically a buzzword that has a wide variety of potential meanings or uses. It so happens that k-means clustering is a basic method used in machine learning, a topic of computer science about performing statistical analysis and building models based off of the data in order to make predictions or generate outcomes. However, k-means clustering on its own is not what people usually mean by machine learning (let alone AI); by itself, it is an algorithm to determine the clusters of data based on how similar each data point is to the average of each cluster. The algorithm itself is actually very straightforward, but it's just easier to compute automatically than by hand (especially when we're talking about 45 rulesets and 89 variables).

Bibliography

Updated from last time! I think this is now 45 different rulesets (if we combine the basic and expert rulebooks for the early-80s D&D).

Arneson, David & Gary Gygax. 1974. Dungeons & Dragons (D&D74).
Gygax, Gary. 1977-9. Advanced Dungeons & Dragons (AD&D).
Cook, David. 1981. Dungeons & Dragons: Expert Rules (D&D81).
Moldvay, Tom. 1981. Dungeons & Dragons, Basic Rules (D&D81).
Mentzer, Frank. 1983. Dungeons & Dragons, Set 1: Basic Rules (D&D83).
Mentzer, Frank. 1983. Dungeons & Dragons, Set 2: Expert Rules (D&D83).
Mentzer, Frank. 1983. Dungeons & Dragons, Set 3: Companion Rules (D&D83).
Gonnerman, Chris. 2006. Basic Fantasy Role-Playing Game (BFRPG).
Proctor, Daniel. 2007. Labyrinth Lord (LL).
Finch, Matthew J. 2008. Swords & Wizardry: Core Rules (S&W1).
Finch, Matthew J. 2009. Swords & Wizardry: Whitebox (S&W0).
Raggi, James Edward, IV. 2011. Lamentations of the Flame Princess (LotFP).
Goodman, Joseph. 2012. Dungeon Crawl Classics (DCC).
Finch, Matthew J. 2013. Swords & Wizardry: Complete Rules (S&W2).
McDowall, Chris. 2014. Into the Odd (ITO).
Mehrstam, Christian. 2015. Whitehack, Second Edition (WH2).
Milton, Ben. 2015. Maze Rats, v0.1 (MR0.1).
B., John. 2016. Into the Depths (ITD).
Black, David. 2016. The Black Hack (TBH1).
Milton, Ben. 2016. Maze Rats, v0.3 (MR0.3).
Black, David. 2018. The Black Hack Booklet, Second Edition (TBH1.5).
Black, David. 2018. The Black Hack, Second Edition (TBH2).
Milton, Ben. 2018. Knave.
Nieudan, Eric. 2018. Macchiato Monsters (MM).
Nilsson, Pelle. 2019. Mörk Borg.
(Skerples). 2019. Many Rats on Sticks, v2 (MROS).
Treme, Nate. 2019. Tunnel Goons.
Gal, Yochai. 2020. Cairn.
Hunt, Leo. 2020. Vaults of Vaarn, Issues 1-3 (VOV).
McDowall, Chris. 2020. Electric Bastionland (EB).
Mehrstam, Christian. 2020. Whitehack, Third Edition (WH3).
Williams, Isaac. 2020. Mausritter: Expanded Edition (Mausritter).
Anderson, Micah. 2021. Bastards.
Boven, Emiel. 2021. DURF.
Crawford, Kevin. 2021. Worlds Without Number (WWN).
(CavernsOfHeresy). 2021. Rogueland.
Rose, Noora. 2021. Unconquered.
Sinclair, Jared. 2021. The Vanilla Game (TVG).
Surles, Reese R. 2021. Crowns.
Bissette, Chris. 2022. A Dungeon Game (Dungeon Game).
Boven, Emiel. 2022. The Electrum Archive, Issue 1 (TEA).
Hartranft, Tobias. 2022. Trespasser: Dark Fantasy Tactics (Trespasser).
Islam, Ava. 2022. Errant.
Linderum, Markus. 2022. Down We Go (DWG).
Verte, Emmy. 2022. FLEE.
McCrowell, Joshua. (Unreleased). His Majesty the Worm (HMTW).
Smith, W.F. (Unreleased). Prismatic Wasteland (PW).
