In this tutorial, I want to talk about the nomenclature of the complex substituents. In the last tutorial of this series, we discussed the fundamentals of the nomenclature and how it works. And now, we are going to add one more layer to it and learn how to name molecules that have complicated branches for substituents.
To start with, let’s talk about the groups that only contain 3 and 4 carbons. When it comes to the 3- and 4-carbon substituents, the IUPAC allows the use of the common names. Those are called the “retained common names” as they were retained when the modern system was accepted.
There are only two 3-carbon groups. The first one is connected at the end carbon, while the second one is connected at the middle carbon.
We call them the “n-propyl” and “isopropyl” groups. The “n” in n-propyl stands for “normal” which means that we have a straight chain with no branching. The “iso” in isopropyl refers to the “isomeric” propyl. This is a historic or a common name, so don’t try to put too much sense into it. Like many things in nomenclature, this is something that you just have to memorize like the dictionary words when you’re learning a new language.
One thing that I do want to point out, though, is that the “i” in the iso- prefix counts for the alphabetical order when we alphabetize our groups in the name, while the “n-” in n-propyl doesn’t. As a matter of fact, we typically skip the “n” altogether, as it is redundant.
Now, the 4-carbon branch has quite a few variations!
We’ve got a normal butyl group, the sec-butyl, the isobutyl, and the tert-butyl. The prefix “sec” stands for secondary, while the prefix “tert” stands for tertiary. Again, those are the traditional names, so you don’t have to remember the logic behind why there’s a “sec” or a “tert” prefix. But you will need to memorize those names and their corresponding structures for the test.
Also, just like in the case with isopropyl, the “i” in isobutyl will count for the alphabetical order while the other prefixes won’t. Here’s a quick rule of thumb you can remember about which prefixes count and which ones don’t. If the prefix is written with a hyphen, like sec- or tert-, it definitely won’t count. If the prefix modifies the structure and is written as one word with the name, like cyclo- or iso- among the ones we’ve already seen, it will count. Finally, any numeric prefixes like di-, tri-, tetra-, etc. won’t count either.
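The rule of thumb above can be sketched in a few lines of Python. This is only an illustration: the function name alphabetization_key and the short prefix lists are my own assumptions, and the simple string matching is only meant for the simple and retained names covered in this tutorial.

```python
# Prefixes written with a hyphen are ignored when alphabetizing.
HYPHENATED_PREFIXES = ("n-", "sec-", "tert-")
# Numeric (multiplying) prefixes on simple substituents are ignored as well.
NUMERIC_PREFIXES = ("di", "tri", "tetra")


def alphabetization_key(substituent: str) -> str:
    """Return the part of a simple or retained name that counts for alphabetizing.

    Hypothetical helper: structural prefixes such as iso- and cyclo- are kept
    because they are spelled as part of the name.
    """
    name = substituent.lower()
    for prefix in HYPHENATED_PREFIXES:
        if name.startswith(prefix):
            name = name[len(prefix):]
            break
    for prefix in NUMERIC_PREFIXES:
        # Strip a numeric prefix only when a name is left after it.
        if name.startswith(prefix) and len(name) > len(prefix):
            name = name[len(prefix):]
            break
    return name


groups = ["tert-butyl", "isopropyl", "ethyl", "dimethyl"]
print(sorted(groups, key=alphabetization_key))
```

Here tert-butyl sorts under “b”, isopropyl under “i”, and dimethyl under “m”.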
Alright, so those were the simple cases that you just have to memorize. How about something more complicated that we can’t commit to memory?
Here’s the example with 5 carbons. This branch is connected at the very end of the chain and doesn’t have any extra branches sticking off it.
So, as we know from the last tutorial, it’s going to be just a simple pentyl group.
Now, how about the other two? Should we use “sec” or maybe “iso” or something else?
This is where the IUPAC steps in and tells us the rule: we start numbering at the place of attachment, find the longest chain, and name the rest of the branch as if it was a standalone molecule. Then, we’ll change the ending -ane to -yl to signify that this is still a branch or a substituent and not a molecule.
Here’s how it works.
We’ll start numbering at the carbon where the branch connects to the parent molecule. We then find the longest chain, which in this case is a 4-carbon chain. Then, we say that there’s a methyl group in the 1st position here and put together the rest of the name. Also, to specify that this is a complex substituent, we are going to enclose it in parentheses. The parentheses are not optional here, so make sure you don’t forget about those on the test!
Ok. Let’s look at how it works in an example.
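To make the steps concrete, here is a minimal Python sketch that assembles the name of a complex substituent from the two things we just determined: the length of the longest chain counted from the attachment carbon, and the positions of the methyl branches on it. The function name and the lookup tables are hypothetical, and the sketch assumes that all the branches are methyl groups.

```python
# Stems for the chain length; only the short chains used in this tutorial.
CHAIN_STEMS = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent", 6: "hex"}
# Numeric prefixes for repeated methyl branches inside the substituent.
NUMERIC_PREFIXES = {1: "", 2: "di", 3: "tri", 4: "tetra"}


def complex_substituent_name(chain_length: int, methyl_positions: list[int]) -> str:
    """Name an alkyl branch whose only side groups are methyls.

    chain_length: carbons in the longest chain, counting the attachment carbon as C1.
    methyl_positions: locants of the methyl groups along that chain.
    """
    stem = CHAIN_STEMS[chain_length]
    if not methyl_positions:
        # No branching: a simple group, no parentheses needed.
        return stem + "yl"
    locants = ",".join(str(p) for p in sorted(methyl_positions))
    prefix = NUMERIC_PREFIXES[len(methyl_positions)]
    # The -ane ending becomes -yl, and the whole name goes in parentheses.
    return f"({locants}-{prefix}methyl{stem}yl)"


print(complex_substituent_name(4, [1]))     # (1-methylbutyl)
print(complex_substituent_name(2, [1, 1]))  # (1,1-dimethylethyl)
```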
Here we have a 6-membered ring with two branches.
The first rule of the nomenclature is to find the longest continuous chain. In this case, it’s going to be the ring itself.
Thus, our parent is going to be cyclohexane.
Next, the branch on the top is a simple ethyl group.
The bottom group is the one that we have just looked at: 1-methylbutyl.
Now, we’ll need to arrange those groups alphabetically in the name. Here’s something important about what counts for the alphabet in the complex substituents. The very first letter in the parentheses will count for the alphabetical order regardless of what it may be (a prefix or anything like that). So, in this case, we have “e” in ethyl vs “m” in the complex substituent. And since “e” is before “m” in the alphabet, the ethyl group will be first in the name. This also gives it precedence over the complex substituent for the numbering, since no carbon in this ring carries more substituents than the other.
So, the name for this compound is 1-ethyl-3-(1-methylbutyl)cyclohexane. Notice how we enclosed the complex substituent in the parentheses. Also, notice the punctuation. We are still using the same principles we’ve learned in the last tutorial: we use commas between numbers and dashes between the numbers and letters. So, there’s no dash, or space, or comma, or anything else between the parentheses of the complex substituent and the rest of the molecule as we treat the parentheses as a “letter” for the purposes of nomenclature.
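The punctuation rules can be written down as a small Python sketch. The function name assemble_name is hypothetical, and the sketch assumes that the groups are passed in already alphabetized, each with its locants.

```python
def assemble_name(substituents: list[tuple[list[int], str]], parent: str) -> str:
    """Join (locants, group) pairs, given in citation order, onto the parent name."""
    parts = []
    for locants, group in substituents:
        numbers = ",".join(str(n) for n in locants)  # commas between numbers
        parts.append(f"{numbers}-{group}")           # dash between a number and a letter
    # Dashes between the groups, and nothing at all before the parent name.
    return "-".join(parts) + parent


print(assemble_name([([1], "ethyl"), ([3], "(1-methylbutyl)")], "cyclohexane"))
# 1-ethyl-3-(1-methylbutyl)cyclohexane
```

Because the parentheses are treated like letters, the complex substituent needs no special handling here.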
Let’s look at a few more examples.
The first thing we want to do is to find the longest chain. In this case it’s an 8-carbon chain. So, the parent will be an octane. We’ll also number the chain to give the branches the lowest possible numbers.
Now, we have a complex substituent, which we already know how to name. It’s an isopropyl group. And as I’ve mentioned earlier, isopropyl is a retained common name. If we wanted to use the strict IUPAC rules, however, we’d call it 1-methylethyl. Thus, there are two ways that we can go when naming this molecule.
Option one will use the common name for the group. Notice that we’re adding the prefix di- to signify that we have two isopropyl groups in this molecule.
And we have option two. In option two we use the strict IUPAC name. Notice that here we changed the numeric prefix di- to bis-.
There’s actually a difference which set of numeric prefixes we use depending on what type of a group we have.
For the simple substituents like ethyl, methyl, etc. we use di-, tri-, tetra-, etc. We also use the same prefixes for the retained common names like isopropyl or tert-butyl. For the complex substituents, however, we use slightly modified prefixes. For 2 groups it’s bis-, for 3 it’s tris-, tetrakis- for 4, pentakis- for 5, etc. It’s a rare case when you’re going to see multiples of the same complex substituent, but it’s still possible. So, it’s a good idea to know about these special prefixes in case your instructor tries to throw a curveball at you on the test.
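Here is a short Python sketch of that choice between the two sets of prefixes. The function name multiplied is hypothetical, and the sketch assumes that a complex substituent can be recognized by the parentheses around its name.

```python
SIMPLE_PREFIXES = {2: "di", 3: "tri", 4: "tetra", 5: "penta"}
COMPLEX_PREFIXES = {2: "bis", 3: "tris", 4: "tetrakis", 5: "pentakis"}


def multiplied(group: str, count: int) -> str:
    """Attach the numeric prefix for `count` copies of a substituent."""
    if count == 1:
        return group
    # Assumption: names in parentheses are complex, everything else is
    # a simple substituent or a retained common name.
    is_complex = group.startswith("(")
    table = COMPLEX_PREFIXES if is_complex else SIMPLE_PREFIXES
    return table[count] + group


print(multiplied("isopropyl", 2))        # diisopropyl
print(multiplied("(1-methylethyl)", 2))  # bis(1-methylethyl)
```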
Alright, here’s another example.
First things first, we need to find the longest continuous chain in this molecule. While it may be tempting to take the two tert-butyl groups and classify them as the substituents, it won’t be correct. Here, our longest chain has 8 carbons. Also, notice the numbering that gives the smallest numbers to the branches nearest to the end of the molecule.
Now, let’s look at the substituents that we have here. We have a few simple methyl groups here. And we also have one complex substituent.
Just like in the previous case, we can use the retained common name here because it’s a 4-carbon group (tert-butyl). Or we can use the strict IUPAC rules and call it a 1,1-dimethylethyl group.
And like in the last case, we can make two names for this compound.
One, using the retained common name. And the other one with the IUPAC name for our complex substituent.
In terms of alphabetization, we have “b” vs “m” in the first case, and “d” vs “m” in the second. Notice, that since the “dimethyl” part belongs to the complex substituent, the letter “d” in this case counts for the alphabet. Remember, that we use the first letter of the complex substituent’s name regardless of what that letter is or what part of the complex substituent it’s coming from.
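The same comparison can be sketched in Python as a sort key. The function name citation_key is hypothetical, and the regular expressions only cover the prefixes mentioned in this tutorial. For a name in parentheses, every letter inside counts and only the locants are dropped. For everything else, the hyphenated and numeric prefixes are stripped first.

```python
import re


def citation_key(group: str) -> str:
    """Return the letters that count when alphabetizing a substituent."""
    name = group.lower()
    if name.startswith("("):
        # Complex substituent: keep every letter inside the parentheses,
        # including a numeric prefix such as the "di" in dimethyl.
        return re.sub(r"[^a-z]", "", name)
    # Simple or retained name: drop n-, sec-, tert- and then di-, tri-, tetra-.
    name = re.sub(r"^(n|sec|tert)-", "", name)
    return re.sub(r"^(di|tri|tetra)(?=[a-z])", "", name)


# "b" vs "m" with the retained name, "d" vs "m" with the systematic one.
print(sorted(["trimethyl", "tert-butyl"], key=citation_key))
print(sorted(["trimethyl", "(1,1-dimethylethyl)"], key=citation_key))
```

In both cases the 4-carbon group is cited before the methyl groups.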
I have a question: what if the longest chain is chosen because it contains the functional group of higher priority, and a complex substituent attached to that chain also carries a functional group of lower priority? How do we name the complex substituent then?
The details depend on the exact example, but the general idea is the same as what I describe in this tutorial.
I’ve got a question.
So, if we have a primary amine where the NH2 is attached to the 5th carbon in a chain of 9 carbon atoms and the second carbon from the end (the 4th carbon if we assign the nitrogen-bearing carbon the number 1) has a methyl group, would the name be
(1-butyl-4-methylpentyl)amine?
If I understood you correctly and this is what you mean, then yes, it would be an acceptable name.
What if two candidate chains have the same number of carbons, but one chain has more simple substituents?
Chain with more substituents takes priority provided no other differences in functional groups are present.
I was also reviewing the IUPAC Blue Book! I suppose for this example, for some reason “chloro” vs “(1,3-di…)” the locant of 1 supersedes the Roman letter despite the examples in 14.5.2. My ChemDraw also gives the name that you have; I just don’t know why.
Well, a ‘typical’ alphabetical order would consider the numerals before the letters. This used to be true at some point. And I suspect that some software and a lot of textbooks and instructors use the ‘typical’ approach rather than what’s actually defined in the current IUPAC rules. ChemDraw and ChemDoodle do give incorrect names in a lot of edge cases and tend to prioritize common names over systematic names on top of that, so they’re not the ‘final authority’ and those autogenerated names should be taken with a grain of salt.
In the loading screen of the video, you write “1-(1,3-dimethylbutyl)-2-chlorocyclohexane” but you do not describe the rationale for giving the complex substituent the lower locant as compared to the alphabetically first “chloro” group. Please explain.
Complex substituents count everything inside of the parentheses for the alphabetical purposes.
I do not follow, C comes before D in the alphabet. The name is the name given by computer programs, I just don’t see why.
I should probably choose a different example for that one. There’s a small discrepancy in how different editions of the IUPAC rules treat this. One of the earlier editions counted numbers before letters in the alphabetical order, so technically, the complex substituent would come first, and this is the way many textbooks and some software treat it. The current rules would not look at the numbers. So, according to the current recommendations, you’re correct, chloro should go first. As I’ve mentioned in one of my posts, nomenclature is one of those topics where things are taught in a non-systematic way and there’s always a mixup between the current recommended rules, what used to be the rules, and what different instructors/authors remember or use. There’s not even an agreement between the journals. So, I guess, for as long as people can understand each other, there’s no real reason to be a stickler for the rules. Although, some instructors and standardized testing organizations might disagree with me 🤣
I’ve been looking for a week to find how IUPAC defines alphabetical order. I suspected that the numerical value “inside the parentheses” somehow was in play. If you have a reference to that rule, I’d love it.
Here’s what the current (2013) rules say on the matter:
“P-14.5 ALPHANUMERICAL ORDER
Alphanumerical order has been commonly called ‘alphabetical order’. As these ordering principles do involve ordering both letters and numbers, in a strict sense, it is best called ‘alphanumerical order’ in order to convey the message that both letters and numbers are involved.
Alphanumerical order is used to establish the order of citation of detachable substituent prefixes (not the detachable saturation prefixes, hydro and dehydro), and the numbering of a chain, ring, or ring system when a choice is possible.
Alphanumerical order is applied as follows in organic nomenclature. Nonitalic Roman letters are considered first, unless used as locants or part of a compound or composite locant, for example, ‘N’ or ‘4a’ (see P-14.3), or in an isotopic descriptor. When all the Roman letters are identical, the set of locants for all initial locants for primary substituents, that is, locants appearing ahead of the first Roman letter of each primary substituent, are compared. Absence of locants is most preferred, followed by italic Roman letter locants, Greek letter locants (as in conjunctive names), if any, and arabic numerals in order from lowest to highest. Thus, the preferred order for alphanumerical order is: nonitalic Roman letters > italic letters > Greek letters.
For the sorting of nonalphanumerical characters, see P-14.6.
In these subsections the principles of alphanumerical order do not include Greek letters (except in conjunctive names) or isotopic or stereochemical descriptors.”
I have a compound that can be named in two ways:
2,5,7-trimethyl-4-(2-methylpropyl)octane
and 2,4,7-trimethyl-5-(2-methylpropyl)octane
Which one will be the correct one? As we know, methyl comes before methylpropyl, so should we take the path where the methyl groups get the lowest possible numbers, as in 2,4,7…, or should we prioritize the complex substituent and try to give it the lowest number?
Neither group has priority here, so you need to number your principal chain to give the lowest possible numbers overall. Since you get 2,4,5,7 locants either way, the tiebreaker here will be the alphabetical order. If you use (2-methylpropyl), then it’s the latter name, if you use the isobutyl (which is allowed by IUPAC), it’ll be 4-isobutyl-2,5,7-trimethyloctane.
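Here’s a minimal Python sketch of that reasoning for an open-chain parent. The function name choose_numbering is hypothetical, and the sketch assumes you supply the locants for one numbering direction along with the order in which the groups are cited in the name. It first compares the complete sets of locants. If those tie, it gives the lower locants to the group cited first.

```python
def choose_numbering(
    forward: dict[str, list[int]], chain_length: int, citation_order: list[str]
) -> dict[str, list[int]]:
    """Pick the numbering direction for an acyclic parent chain."""
    # Numbering from the other end turns locant n into chain_length + 1 - n.
    reverse = {
        group: sorted(chain_length + 1 - n for n in locants)
        for group, locants in forward.items()
    }

    def all_locants(numbering):
        return sorted(n for locants in numbering.values() for n in locants)

    def by_citation(numbering):
        return [sorted(numbering[group]) for group in citation_order]

    # Rule 1: the lowest locants overall. Rule 2 (the tiebreaker): the lowest
    # locants for the group cited first, then the next one, and so on.
    return min((forward, reverse), key=lambda n: (all_locants(n), by_citation(n)))


one_way = {"methyl": [2, 5, 7], "(2-methylpropyl)": [4]}
print(choose_numbering(one_way, 8, ["methyl", "(2-methylpropyl)"]))
# {'methyl': [2, 4, 7], '(2-methylpropyl)': [5]}
```

If the group is called isobutyl instead, it is cited before methyl, so passing ["isobutyl", "methyl"] as the citation order keeps it at position 4. Rings have more than two possible numberings, so this sketch does not cover them.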
I have a question about complex substituents with the same first letter as another simple substituent. I have 3 simple methyl groups which would be 2,4,4- trimethyl and a complex substituent 5-(1-methylethyl). Which one would come first? They would both be alphabetized by an M so I’m confused. The parent name is octane.
Would it be:
1) 2,4,4-trimethyl-5-(1-methylethyl)octane
Or
2) 5-(1-methylethyl)-2,4,4-trimethyloctane
Thank you!
Hey Amanda,
There are two recommendations: one is correct, and the other is in the “that’s how I teach chemistry in my class” style. The current (2013) IUPAC rules use the alphabetical order in the broadest sense, meaning you’ll have to go through the substituent names letter by letter till you find the difference. In this case, the complex substituent’s name continues while “methyl” ends, so the simple substituent has the alphabetical priority. Often, however, the complex substituent is erroneously (and surprisingly commonly) given priority in cases like this. So, according to the current recommendations, your first name would be correct. However, in my experience, not every instructor “got the memo” 😜 so, if that’s something you’re concerned about for your class, you might wanna ask your instructor how they would grade it. If they write your final exam, it’s up to them. If you have a standardized final, like an ACS exam or similar, the former option would be the more correct one. I’m saying “more” correct, b/c neither name is, strictly speaking, a Preferred IUPAC Name (PIN) according to the current recommendations. But you know, the current rules were published in 2013 and we still teach the “incorrect” ones even according to the textbooks published this year 😹
Shouldn’t 2,2-dimethyl come before 4-(2,2-dimethylpropyl) in the last structure? Dimethyl comes before dimethylpropyl in the English dictionary, so alphabetically 2,2-dimethyl should be written before the complex substituent.
Nope. A complex substituent counts everything inside of parentheses for the alphabetical order, so you have “d” in (2,2-dimethylpropyl) vs “m” in methyl. Notice, for simple substituents, the numeric prefix (“di” in this case) is not counted for the alphabetization.
I can’t draw the compound here, but its name is given as 3-(dibromomethyl)-4-(tribromomethyl) hexane. If we ignore di and tri in IUPAC, why can’t it be 4-dibromomethyl 4tribromomethyl hexane?
We don’t ignore the numeric prefixes for the purposes of the alphabetic order if they are a part of a complex name inside of parentheses.
*2-(1-methyl ethyl)-3-(2-cyclohexyl-1-cycloprop-2-enyl ethyl).
*3-(2-cyclohexyl-1-cycloprop-2-enyl ethyl)-2-(1-methyl ethyl)
The latter one is correct, am I right?
From the pure alphabetical order perspective of ordering your complex substituents, yes. Also, no space before ethyls in your complex substituents. I’m now curious what in the hell this monstrosity is that would require complex substituents like this. What’s the complete name?
in the last structure
shouldn’t ethyl come before methyl propyl???
There’s no ethyl in the last structure.
I’ve two questions.
1. Sec-butyl comes before tert-butyl in alphabetical order. How about butyl itself? Does it come before or after sec-/tert-butyl? Are there any rules that apply?
2. In a cycloalkane, does numbering start from butyl or from sec-/tert-butyl if only these two substituents are present?
Regular butyl should come in first. You’d also start numbering from butyl in the case of a cycle. Side note: you should use the complex nomenclature instead of common names as a general rule of thumb.