Lipid classification, nomenclature and structure drawing
The LIPID MAPS consortium has developed a comprehensive classification, nomenclature, and chemical representation system for lipids, the details of which are described in the May 2009 issue of the Journal of Lipid Research:
- Fahy E, Subramaniam S, Murphy R, Nishijima M, Raetz C, Shimizu T, Spener F, van Meer G, Wakelam M and Dennis E.A., Update of the LIPID MAPS comprehensive classification system for lipids. J. Lipid Res. (2009) 50: S9-S14. PubMed ID: 19098281.
- Fahy E, Subramaniam S, Brown H, Glass C, Merrill JA, Murphy R, Raetz C, Russell D, Seyama Y, Shaw W, Shimizu T, Spener F, van Meer G, Vannieuwenhze M, White S, Witztum J and Dennis E.A., A comprehensive classification system for lipids. J. Lipid Res. (2005) 46: 839-861. PubMed ID: 15722563.
The LIPID MAPS Lipid Classification System comprises eight lipid categories, each with its own subclassification hierarchy. All lipids in the LIPID MAPS Structure Database (LMSD) have been classified using this system and have been assigned LIPID MAPS IDs (LM_IDs) that reflect their position in the classification hierarchy. Starting from a lipid category, the user can navigate through the hierarchy by clicking on the "[+]" icon next to a main class name. This expands that item to reveal its sub classes. Clicking on the hyperlinks to the right of main classes, sub classes or level 4 classes displays a tabular listing of all lipids in the LMSD that belong to that particular subset. Finally, clicking on an LM_ID hyperlink displays the LMSD record for an individual lipid, which contains an image of the molecular structure, common and systematic names, links to external databases, Wikipedia pages (where available), other annotations and links to structure viewing tools.
Category (Example: Prenol lipids [LMPR])
Main class (Example: Isoprenoids [LMPR01])
Sub class (where applicable) (Example: C15 Isoprenoids (sesquiterpenes) [LMPR0103])
Level 4 class (where applicable) (Example: Bisabolane sesquiterpenoids [LMPR010306])
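The four example codes above nest by prefix: each level appends two characters to the code of its parent. A minimal Python sketch of that relationship follows. The function name `classification_path` and the length-based lookup are illustrative assumptions, not part of any LIPID MAPS tool.

```python
# Illustrative sketch: map a classification code to its chain of parent codes.
# Assumption: level is determined purely by code length (4, 6, 8 or 10
# characters), as in the LMPR examples above.
LEVEL_NAMES = {
    4: "Category",
    6: "Main class",
    8: "Sub class",
    10: "Level 4 class",
}


def classification_path(code: str) -> list[tuple[str, str]]:
    """Return (level name, code) pairs from the category down to `code`."""
    if not code.startswith("LM") or len(code) not in LEVEL_NAMES:
        raise ValueError(f"not a recognised classification code: {code!r}")
    return [
        (LEVEL_NAMES[length], code[:length])
        for length in sorted(LEVEL_NAMES)
        if length <= len(code)
    ]


if __name__ == "__main__":
    for level, prefix in classification_path("LMPR010306"):
        print(f"{level}: {prefix}")
```

Calling it with the level 4 code for bisabolane sesquiterpenoids lists LMPR, LMPR01, LMPR0103 and LMPR010306 in order.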
The LIPID MAPS LM_ID identifier
The LIPID MAPS ID (LM_ID) is a unique identifier based on the classification scheme described above. The format of the LM_ID, outlined in the table below, provides a systematic means of assigning unique IDs to lipid molecules and allows for the addition of large numbers of new categories, classes and subclasses in the future. Currently, most LM_IDs have 12 characters; the exceptions are molecules belonging to the level-4 classes of prenol lipids, which have LM_IDs of 14 characters. The last four characters of the LM_ID form a unique identifier within a particular subset and are randomly assigned. In the case of generic structures containing "R" groups (such as 1,2-diacyl-sn-glycero-3-phosphocholine), the last four characters are all zeros (e.g. LMGP01010000). Note also that for most of the Glycosphingolipid subclasses, the structure of the glycan chain is known but the exact structure of the N-acyl side-chain is unknown. In these cases, the last two characters of the LM_ID are assigned as "00" to signify an unspecified N-acyl side-chain, and the third- and fourth-last characters are given a different 2-letter identifier for every unique glycan chain within that subclass. For example, in the Gangliosides subclass ([LMSP0601]) the GM1 generic structure is assigned an LM_ID of LMSP0601AP00, where the "AP" characters specify the unique NeuAcα2-3Galβ1-3(NeuAcα2-6)GalNAcβ1-3Galα1-4Galβ1-4Glcβ glycan chain and the terminal "00" characters indicate a generic ceramide structure.
Depositing new lipid structures in the LMSD and assigning LM_IDs
The Bioinformatics core of the LIPID MAPS Consortium is responsible for registering new lipid structures and assigning LM_ID identifiers. Individuals who wish to deposit novel structures in the LIPID MAPS Structure Database (LMSD) may do so via a Web-based registration system on the LIPID MAPS website. This system enables users to enter lipid structures and accompanying names, synonyms, references, and classification information. During the submission process, structures are checked for uniqueness by searching the current database. The submitted structures are stored in a private, temporary database, where they are reviewed by LIPID MAPS bioinformatics staff before being classified, checked for correct nomenclature, and registered in the public LMSD.
Questions regarding the submission of structures should be submitted via our Contact page.
LIPID MAPS LM_ID format
|Characters||Description||Example||Notes|
|1-2||Fixed "LM" designation||LM||Always LM|
|3-4||2-letter category code||FA||One of 8 categories|
|5-6||2-digit class code||03||-|
|7-8||2-digit subclass code||02||May be '00' (no subclass)|
|9-10||2-digit level-4 class code||02||Only used for lipids within a level-4 class|
|Last 4 characters||Unique 4-character identifier within subclass||7312||-|
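The field layout in the table can be illustrated with a small parser. The following Python sketch is only an illustration under stated assumptions: the names `LMID` and `parse_lm_id` are hypothetical, 12-character IDs are assumed to have no level-4 field, 14-character IDs are assumed to carry one, and the final four characters are assumed to be upper-case letters or digits.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LMID:
    """Fields of an LM_ID, following the table above (hypothetical helper type)."""
    category: str          # characters 3-4, e.g. "GP"
    main_class: str        # characters 5-6
    subclass: str          # characters 7-8, "00" when there is no subclass
    level4: Optional[str]  # characters 9-10, only present in 14-character IDs
    unique: str            # last 4 characters

    @property
    def all_zero_identifier(self) -> bool:
        """True when the last four characters are all zeros (generic "R" group structure)."""
        return self.unique == "0000"


# Assumption: 12 characters means no level-4 code, 14 characters means one is present.
_WITHOUT_LEVEL4 = re.compile(r"LM([A-Z]{2})(\d{2})(\d{2})([A-Z0-9]{4})")
_WITH_LEVEL4 = re.compile(r"LM([A-Z]{2})(\d{2})(\d{2})(\d{2})([A-Z0-9]{4})")


def parse_lm_id(lm_id: str) -> LMID:
    """Split an LM_ID string into its classification fields."""
    if len(lm_id) == 12:
        match = _WITHOUT_LEVEL4.fullmatch(lm_id)
        if match:
            category, main_class, subclass, unique = match.groups()
            return LMID(category, main_class, subclass, None, unique)
    elif len(lm_id) == 14:
        match = _WITH_LEVEL4.fullmatch(lm_id)
        if match:
            category, main_class, subclass, level4, unique = match.groups()
            return LMID(category, main_class, subclass, level4, unique)
    raise ValueError(f"not a well-formed LM_ID: {lm_id!r}")


if __name__ == "__main__":
    print(parse_lm_id("LMGP01010000"))  # generic diacylglycerophosphocholine
    print(parse_lm_id("LMSP0601AP00"))  # GM1 generic structure
```

The sketch checks only the shape of the string. It does not verify that the category or class codes exist in the LMSD.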
The nomenclature of lipids falls into two main categories: systematic names and common or trivial names. The latter includes abbreviations, which are a convenient way to define acyl/alkyl chains in glycerolipids, sphingolipids and glycerophospholipids. The generally accepted guidelines for lipid systematic names have been defined by the International Union of Pure and Applied Chemistry and the International Union of Biochemistry and Molecular Biology (IUPAC-IUBMB) Commission on Biochemical Nomenclature (http://www.chem.qmul.ac.uk/iupac/). The LIPID MAPS nomenclature proposal follows existing IUPAC-IUBMB rules closely and should not be viewed as a competing format. The main differences involve (a) clarification of the use of core structures to simplify systematic naming of some of the more complex lipids, and (b) provision of systematic names for recently discovered lipid classes. Key features of our lipid nomenclature scheme are as follows:
- The use of the stereospecific numbering (sn) method to describe glycerolipids and glycerophospholipids. The glycerol group is typically acylated or alkylated at the sn1 and/or sn2 position, with the exception of some lipids which contain more than one glycerol group and archaebacterial lipids in which sn2 and/or sn3 modification occurs.
- Definition of sphinganine and sphing-4-enine as core structures for the sphingolipid category, where the D-erythro or 2S,3R configuration and 4E geometry (in the case of sphing-4-enine) are implied. In molecules containing stereochemistries other than the 2S,3R configuration, the full systematic names are to be used instead (e.g., 2R-amino-1,3R-octadecanediol).
- The use of core names such as cholestane, androstane, and estrane, for sterols.
- Adherence to the names for fatty acids and acyl-chains (formyl, acetyl, propionyl, butyryl, etc.) defined in Appendices A and B of the IUPAC-IUBMB recommendations.
- The adoption of a condensed text nomenclature for the glycan portions of lipids, where sugar residues are represented by standard IUPAC abbreviations, and where the anomeric carbon locants and stereochemistry are included but where the parentheses are omitted. This system has also been proposed by the Consortium for Functional Glycomics (http://www.functionalglycomics.org/static/index.shtml).
- The use of E/Z designations (as opposed to trans/cis) to define double-bond geometry.
- The use of R/S designations (as opposed to α/β or D/L) to define stereochemistries. The exceptions are those describing substituents on glycerol (sn) and sterol core structures, and anomeric carbons on sugar residues. In these latter special cases, the α/β format is firmly established.
- The common term "lyso", denoting the position lacking a radyl group in glycerolipids and glycerophospholipids, will not be used in systematic names, but will be included as a synonym.
- The proposal for a single nomenclature scheme to cover the prostaglandins, isoprostanes, neuroprostanes, and related compounds where the carbons participating in the cyclopentane ring closure are defined and where a consistent chain numbering scheme is used.
- The "d" and "t" designations used in shorthand notation of sphingolipids refer to 1,3-dihydroxy and 1,3,4-trihydroxy long-chain bases, respectively.
- The LIPID MAPS glycerophospholipid abbreviations (PC, PE, etc.) are used to refer to species with one or two radyl side-chains, where the structures of the side chains are indicated within parentheses in the 'Headgroup(sn1/sn2)' format (e.g. PC(16:0/18:1(9Z))). By default, R stereochemistry at the C2 carbon of glycerol and attachment of the headgroup at the sn3 position are assumed. For molecules with the opposite (S) stereochemistry at C2 of the glycerol group and attachment of the headgroup at the sn1 position, the stereochemistry specification [S] is appended to the abbreviation and the 'Headgroup(sn3/sn2)' abbreviation format is used. For molecules with unknown stereochemistry at the C2 carbon of the glycerol group, the stereochemistry specification [U] is appended to the abbreviation and the structure is drawn with C2 stereochemistry unspecified. For example: Abbreviation: PC(16:0/18:1(9E)[U]); LM_ID: LMGP01010582; Systematic name: 1-hexadecanoyl-2-(9E-octadecenoyl)-sn-glycero-3-phosphocholine.
- In a similar fashion, the LIPID MAPS glycerolipid abbreviations (MG, DG, TG for mono-, di- and triradylglycerols respectively) are used to refer to species with one to three radyl side-chains, where the structures of the side chains are indicated within parentheses in the 'Headgroup(sn1/sn2/sn3)' format (e.g. TG(16:0/18:1(9Z)/16:0)).
- The alkyl ether linkage is represented by the "O-" prefix, e.g. DG(O-16:0/18:1(9Z)/0:0), and the (1Z)-alkenyl ether (neutral plasmalogen) species by the "P-" prefix, e.g. DG(P-14:0/18:1(9Z)/0:0). The same rules apply to the headgroup classes within the Glycerophospholipids category. In cases where the total composition of a glycerolipid is known, but side-chain regiochemistry and stereochemistry are unknown, abbreviations such as TG(52:1) and DG(34:2) may be used, where the numbers within parentheses refer to the total number of carbons and double bonds of all the chains.
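The 'Headgroup(sn1/sn2/sn3)' abbreviation format and the summed form such as TG(52:1) can be illustrated with a short parser. The Python sketch below is an assumption-laden illustration, not a LIPID MAPS tool: the names `Chain`, `parse_abbreviation` and `total_composition` are hypothetical, the [S] and [U] suffixes and the sphingolipid "d"/"t" prefixes are not handled, and ether-linked ("O-" or "P-") chains are parsed but deliberately excluded from summation because the text above does not say how they appear in the summed form.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chain:
    linkage: str             # "" (acyl), "O" (alkyl ether) or "P" (1Z-alkenyl ether)
    carbons: int
    double_bonds: int
    locants: tuple = ()      # double-bond positions with geometry, e.g. ("9Z",)


# One chain: optional O-/P- prefix, carbons:double bonds, optional (locants).
_CHAIN = re.compile(r"(?:([OP])-)?(\d+):(\d+)(?:\(([^()]*)\))?")


def _split_chains(body: str) -> list[str]:
    """Split on '/' only outside parentheses."""
    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "/" and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_abbreviation(abbrev: str) -> tuple[str, list[Chain]]:
    """Return the headgroup and the chains in sn order."""
    head, sep, rest = abbrev.partition("(")
    if not head or not sep or not rest.endswith(")"):
        raise ValueError(f"unsupported abbreviation: {abbrev!r}")
    chains = []
    for text in _split_chains(rest[:-1]):
        match = _CHAIN.fullmatch(text)
        if match is None:
            raise ValueError(f"unsupported chain {text!r} in {abbrev!r}")
        linkage, carbons, double_bonds, locants = match.groups()
        chains.append(Chain(linkage or "", int(carbons), int(double_bonds),
                            tuple(locants.split(",")) if locants else ()))
    return head, chains


def total_composition(abbrev: str) -> str:
    """Collapse chains to the summed 'Headgroup(carbons:double bonds)' form."""
    head, chains = parse_abbreviation(abbrev)
    if any(chain.linkage for chain in chains):
        raise ValueError("ether-linked chains are not summed in this sketch")
    carbons = sum(chain.carbons for chain in chains)
    double_bonds = sum(chain.double_bonds for chain in chains)
    return f"{head}({carbons}:{double_bonds})"


if __name__ == "__main__":
    print(parse_abbreviation("PC(16:0/18:1(9Z))"))
    print(total_composition("TG(16:0/18:1(9Z)/16:0)"))
```

Summing the chains of TG(16:0/18:1(9Z)/16:0) gives 50 carbons and one double bond, so the sketch returns TG(50:1).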
LIPID MAPS abbreviations for glycerophospholipids
|Class||Abbreviation||Example|
|Glycerophosphocholines||PC (LPC for lyso species)||PC(P-16:0/18:2(9Z,12Z))|
|Glycerophosphoethanolamines||PE (LPE for lyso species)||PE(O-16:0/20:4(5Z,8Z,11Z,14Z))|
|Glycerophosphoserines||PS (LPS for lyso species)||PS(16:0/18:1(9Z))|
|Glycerophosphoglycerols||PG (LPG for lyso species)||-|
|Glycerophosphates||PA (LPA for lyso species)||PA(16:0/0:0) or LPA(16:0)|
|Glycerophosphoinositols||PI (LPI for lyso species)||PI(18:0/18:0)|
LIPID MAPS abbreviations for glycerolipids
LIPID MAPS abbreviations for sphingolipids
Lipid structure drawing
Large and complex lipids are difficult to draw, which leads to the use of shorthand and unique formats that often generate more confusion than clarity among lipidologists. We propose a more consistent format for representing lipid structures where, in the simplest case of the fatty acyl derivatives, the acid group (or equivalent) is drawn on the right and the hydrophobic hydrocarbon chain is on the left (see Fig. 1). Notable exceptions are found in the eicosanoid class, where the hydrocarbon chain wraps around in a counterclockwise direction to produce a more condensed structure. Similarly, with regard to the glycerolipids and glycerophospholipids, the radyl chains are drawn with the hydrocarbon chains to the left and the glycerol group depicted horizontally, with stereochemistry at the sn carbons defined (if known). The general term radyl is used to denote either acyl, alkyl, or 1-alkenyl substituents (http://www.chem.qmul.ac.uk/iupac/lipid/lip1n2.html), allowing for coverage of alkyl and 1Z-alkenylglycerols. The sphingolipids, although they do not contain a glycerol group, have a similar structural relationship to the glycerophospholipids in many cases and may be drawn with the C1 hydroxyl group of the long-chain base to the right and the alkyl portion to the left. This methodology places the head groups of both sphingolipids and glycerophospholipids on the right-hand side. In addition, the linear prenol lipids are drawn in a fashion analogous to the fatty acids, with the terminal functional group on the right-hand side. Sterol lipids are universally drawn with the A ring at bottom-left and the 5-membered D ring at top-right. Inevitably, a number of structurally complex lipids, such as acylaminosugar glycans, polycyclic isoprenoids, and polyketides, do not lend themselves to these simplified drawing rules. Nevertheless, we believe that the adoption of the guidelines proposed here will unify chemical representation and make it more comprehensible.
Many classes of lipids are well suited to automated structure drawing because of their consistent 2-dimensional layout. A suite of structure-drawing tools has been developed and deployed that permits in silico generation of structures, systematic names and abbreviations. The structures may be viewed and exported in a variety of formats. Online versions of the structure-drawing tools for fatty acyls, glycerolipids, glycerophospholipids, sphingolipids, and sterols are available in the Tools section of the LIPID MAPS website. Examples of structures for the 8 lipid categories are shown in the figure below.