February 23, 2012
Weekly Language Usage Tips: n or N in sample sizes & commas and adjectives or coordinate adjectives
Tip 1: n or N in reference to sample size
A reader writes:
Can you remind me when you use a capital N and when you use a small case n for sample size?
This is a little far afield for me; I have to leave my comfortable world of language usage and venture into the scary and alien world of statistics. But you asked. So here goes:
Sample size refers to the number of units or individuals in a group being studied. We need the appropriate number of units to ensure that the sample we are studying is representative of the population of interest. If it is representative, then we can make generalizations about the population based on our findings.
It turns out that this is not as straightforward as I expected. And it is very annoying; statistics is not supposed to be namby-pamby. I expect that there will be all kinds of variation when we talk about language usage. But statistics? I’m shocked and appalled.
The terms are used in two different ways. One common convention is that N equals the size of the population, and n equals the sample size. Simple enough, right?
But not so fast. Some consider N to be the total sample size and n to be a subset of the sample.
See what the American Medical Association Manual of Style has to say:
N: total number of units (eg, patients, households) in the sample under study.
Example: We assessed the admission diagnoses of all patients admitted from the emergency department during a 1-month period (N = 127).
n: number of units in a subgroup of the sample under study.
Example: Of the patients admitted from the emergency department (N = 127), the most frequent admission diagnosis was unstable angina (n = 38).
Iverson C, Christiansen S, Flanagin A, et al. AMA Manual of Style: A Guide for Authors and Editors. 10th ed. New York, NY: Oxford University Press; 2007.
This sounds like N is the sample size and n is just part of the sample. Now what? What’s a simple writer supposed to do?
Well, if you are me, you consult all your friends and colleagues with statistical backgrounds to see if you can get a firm answer. When they all punt (thanks, guys), you decide to tell people that the ways in which n and N are used vary, and that they should consult the journal they are writing for to determine which convention it uses.
And then, when I was about to give up and leave it at that, I heard from a friend:
About N. I’ve seen the conventions that you mention, and they are both widely used in specific contexts. In writing math, you can pretty freely redefine symbols if you do it explicitly (“let n be the size of the sample…”). Lots of people use N as sample size too (including me). I think statistical diction is, oddly, less strict in some ways than ordinary academic prose. There has to be flexibility. For example, a common convention in theoretical statistics is to print vectors or matrices in bold face. Except that some important journals ask you to represent matrices using capital italic letters… if so, N would be a matrix, which you don’t want.
Before I say anything else, I want to say this: PLEASE DON’T ASK ME ABOUT MATRICES AND VECTORS. My head hurts just typing the words.
I’m somewhat stunned by these statements, though: “I think statistical diction is, oddly, less strict in some ways than ordinary academic prose. There has to be flexibility.” I never knew! Well, you learn something new every day. I promise to stop thinking of statistics and statisticians as rigid. In light of this, my advice still stands: check with the journal you are targeting, and use the journal’s method for describing samples.
I just got an email from another colleague, this time the Chair of Biostatistics. So I will let her have the last word on this subject:
I was taught
n = sample size
N = population size
If you have a subgroup sample size, it is indexed so n_i for subgroup i.
I think this is how most statisticians are taught. However, I am loath to go against the AMA advice.
The little letter for sample size mirrors other statistics such as x_i for the value of the variable X (note big letter) for member i in the sample. And also x_bar etc.
But frankly one goes with the notation of the audience so if writing for an AMA journal, use their notation.
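Written out in LaTeX, the convention my colleague describes might look something like the sketch below. The wording of each definition is mine, and the last line assumes that "x_bar" means the ordinary sample mean, which the email does not spell out.

```latex
\begin{align*}
N       &= \text{population size} \\
n       &= \text{sample size} \\
n_i     &= \text{sample size of subgroup } i \\
x_i     &= \text{value of the variable } X \text{ for member } i \text{ of the sample} \\
\bar{x} &= \frac{1}{n} \sum_{i=1}^{n} x_i \quad \text{(sample mean, assumed meaning of x\_bar)}
\end{align*}
```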
[NOTE: The only other thing I wanted to mention is this. Did you see (in the quote above) how the AMA Manual of Style writes ‘e.g.,’? It uses ‘eg,’ and ‘ie,’ these days. I checked, and the other style guides still don’t sanction this. But the AMA? Shocking!]
Tip 2: Adjectives and commas or coordinate adjectives
A reader writes:
Sorry to bother. I’ve been sitting in my office revising a manuscript and have spent too long trying to figure this out, so I am emailing you…
The title of the manuscript is: “The 16-Hour “Long Call” Shift in the Era of Duty Hour Reform.”
An opening sentence in my conclusions is: “Compliance with a 16-hour, “long call” shift length is sensitive to both total workload and workload timing factors.”
The question is about the comma. Should there be a comma after 16-Hour? In these two examples, it seems like both should have it or not, but it just doesn’t sound right when I read it…
No bother. Actually, I appreciate this opportunity to tell you about comma use with two or more adjectives. Sometimes, you will run across a sentence with adjectives that are separated by commas, and sometimes, you will see a series of adjectives without any commas, and both will be correct grammatically! How can that be? How do we decide whether there should be a comma or not?
To explain this, I have to do something that I try to avoid in the Weekly Language Usage Tips, and that is to use a grammatical term. For that, I apologize in advance. You use commas between coordinate adjectives, and you don’t use commas between non-coordinate (or cumulative) adjectives. Coordinate adjectives are adjectives that modify a noun equally and separately. Now, what does that mean, to modify a noun equally and separately? The easiest way to explain is with an example:
The old, broken-down car sat in the driveway for weeks.
In this example, the car is old, and the car is broken-down. It is both of those things, and the order doesn’t matter.
The broken-down, old car sat in the driveway for weeks.
Now, non-coordinate (or cumulative) adjectives modify a noun, and they can also modify the other adjectives. With non-coordinate adjectives, the order matters. For example:
Two bright red ribbons held her hair in place.
The adjectives ‘two,’ ‘bright,’ and ‘red’ all describe the noun ‘ribbons,’ but ‘bright’ also describes the type of ‘red.’ Change the order of the words, and the sentence makes no sense:
Bright red two ribbons held her hair in place. (What?)
Red bright two ribbons held her hair in place. (Huh?)
The thing to remember is that coordinate adjectives need commas. To count as coordinate, a pair of adjectives has to pass two tests. Let’s use the reader’s example for this.
Compliance with a 16-hour, “long call” shift length is sensitive to both total workload and workload timing factors.
For the first test, reverse the order of the adjectives and see if the sentence still makes sense:
Compliance with a “long call” 16-hour shift length is sensitive to both total workload and workload timing factors.
Okay, that still makes sense. For the second test, put the word ‘and’ in between the adjectives and see if the sentence still makes sense:
Compliance with a 16-hour and “long call” shift length is sensitive to both total workload and workload timing factors.
The sentence no longer makes sense. For adjectives to be coordinate (and so need a comma), both the reversal test AND the ‘and’ test have to work. If only one test works, or neither does, the adjectives are non-coordinate and don’t need a comma. So here we have non-coordinate adjectives, and the comma is not used:
Compliance with a 16-hour “long call” shift length is sensitive to both total workload and workload timing factors.
Let’s try one more example with a simpler sentence:
The quick brown fox jumps over the happy lazy dog.
[NOTE: I added the word ‘happy’ so we would have two pairs of adjectives to play with.]
For the first test, we will reverse the order of the adjectives:
The brown quick fox jumps over the lazy happy dog.
‘The brown quick fox’ doesn’t make sense, so those adjectives are non-coordinate and do not need a comma. However, ‘the lazy happy dog’ still works, so we have to go on to the second test:
The quick brown fox jumps over the lazy and happy dog.
Okay, that works, too. The dog can be lazy and happy, so this pair of adjectives is coordinate and calls for a comma, and this is the way the sentence should be punctuated:
The quick brown fox jumps over the happy, lazy dog.
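For readers who like to see a rule written down as a procedure, here is a minimal Python sketch of the two-test decision. The function names `is_coordinate` and `join_adjectives` are made up for illustration. The code does not judge whether a sentence "makes sense"; a human reader still has to do that, and the verdicts recorded below are simply the ones reached in the examples above.

```python
from typing import Optional


def is_coordinate(reversal_ok: bool, and_ok: Optional[bool] = None) -> bool:
    """Apply the two tests: both must pass for the pair to be coordinate."""
    if not reversal_ok:
        # Failing the reversal test settles it; the 'and' test is not needed.
        return False
    if and_ok is None:
        raise ValueError("reversal passed, so the 'and' test must be judged too")
    return and_ok


def join_adjectives(first: str, second: str, coordinate: bool) -> str:
    """Put a comma between coordinate adjectives, a plain space otherwise."""
    return f"{first}, {second}" if coordinate else f"{first} {second}"


# (reversal test, 'and' test) verdicts from the examples; None = never tried.
verdicts = {
    ("16-hour", '"long call"'): (True, False),
    ("quick", "brown"): (False, None),
    ("happy", "lazy"): (True, True),
}

for (first, second), (reversal_ok, and_ok) in verdicts.items():
    print(join_adjectives(first, second, is_coordinate(reversal_ok, and_ok)))
```

Only the last pair passes both tests, so it is the only one printed with a comma.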
Got it? Good. See, no bother at all.
[NOTE: By the way, do you know why “The quick brown fox jumps over the lazy dog” sounds so familiar? It is because the sentence includes all the letters of the English alphabet, and it is commonly used to demonstrate typefaces and fonts and for testing keyboards.]