STN Database Summary Sheet
The LREGISTRY File is a training database intended for learning how to use the REGISTRY File. It is a chemical structure and dictionary database that contains approximately 125,000 substance records for compounds identified by the Chemical Abstracts Service (CAS) Registry System. These records are for the substances indexed in the LCA File and the LCASREACT File. All substance records contain a unique CAS Registry Number(R) and index name. Substance records may also have synonyms, molecular formulas, alloy composition tables, classes for polymers, nucleic acid and protein sequences, and structure diagrams, all of which are searchable and displayable.
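As a side note on the Registry Number format itself: the final digit of a CAS Registry Number is a check digit. The following is a minimal Python sketch of that check, not part of STN; the function name `cas_rn_is_valid` is hypothetical and chosen only for illustration. It can be used to catch typing errors before entering a number in a /RN or /CRN search.

```python
import re


def cas_rn_is_valid(rn: str) -> bool:
    """Return True if rn has the CAS RN layout and a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if match is None:
        return False
    body = match.group(1) + match.group(2)  # every digit except the check digit
    check = int(match.group(3))
    # Weight the body digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check


if __name__ == "__main__":
    for rn in ("97-77-8", "79-10-7", "97-77-9"):
        print(rn, cas_rn_is_valid(rn))
```

The first two numbers are the ones used in the /RN and /CRN search examples on this sheet and pass the check; the third has a deliberately altered last digit and fails.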
Left truncation is available in the Chemical Name Segment (/CNS) and Notes (/NTE) fields.
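The left-truncation rule can be sketched as a small pre-check in Python. This is an illustration only, not an STN facility: the helper name `left_truncation_ok` is hypothetical, and it assumes that the 4-character minimum stated in the field-table footnotes counts the characters of the term without the `?` truncation symbols.

```python
# Fields that accept left truncation according to this sheet.
LEFT_TRUNCATION_FIELDS = {"CNS", "NTE"}
# Minimum term length with left truncation (assumed to exclude the "?" symbols).
MIN_LEFT_TRUNCATED_LENGTH = 4


def left_truncation_ok(term: str, field: str) -> bool:
    """Check a search term against the left-truncation rule for a field code."""
    if not term.startswith("?"):
        return True  # no left truncation, so the rule does not apply
    if field.upper().lstrip("/") not in LEFT_TRUNCATION_FIELDS:
        return False
    stem = term.strip("?")  # drop leading and trailing truncation symbols
    return len(stem) >= MIN_LEFT_TRUNCATED_LENGTH


if __name__ == "__main__":
    print(left_truncation_ok("?QUAT?", "/CNS"))
    print(left_truncation_ok("?ABC", "/CNS"))
    print(left_truncation_ok("?QUAT?", "/CN"))
```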
LREGISTRY is a member of the following file cluster: LEARNING.
Chemical Abstracts Service 2540 Olentangy River Road P. O. Box 3012 Columbus, OH 43210 USA
In the U.K. and Ireland:
  The Royal Society of Chemistry (RSC)
  Cambridge, United Kingdom
  Phone: (+44) (1223) 432110
  FAX: (+44) (0223) 423623

In the Federal Republic of Germany, Austria, and Switzerland:
  Fachinformationszentrum Chemie GmbH
  Berlin, Federal Republic of Germany
  Phone: (+49)(030) 39076-201
  Fax: (+49) (030) 39076-333

In Japan:
  The Japan Association for International Chemical Information
  Tokyo, Japan
  Phone: (+81)(033) 5978-3601
  FAX: (+81)(033) 5978-3600

In France:
  Compagnie d'Application et d'Assistance en Documentation (CAPADOC)
  Boulogne, France
  Phone: (+33)(01)4603-1085
  FAX: (+33)(01)4603-9890

In Australia:
  Damon Ridley
  School of Chemistry, F11
  University of Sydney NSW 2006
  Sydney, Australia
  Phone: (+61) (02) 351 2180
  Fax: (+61) (02) 351 6650
  Email: dridley@chem.usyd.edu.au

In Finland:
  Technical Research Centre of Finland (VTT)
  Espoo, Finland
  Phone: (+358)(90) 4564386
  FAX: (+358)(90)456-4374

In Sweden:
  Information and Documentation Center
  Royal Institute of Technology Library (IDC-KTHB)
  Stockholm, Sweden
  Phone: (+46)(08) 790 89 50
  FAX: (+46)(08) 790 8954

In Belgium:
  Royal Library NCWDT-CNDST
  Keizerslaan 4
  Bld de l'Empereur
  Brussels, Belgium
  Phone: (+32)(02) 519.56.44
  Fax: (+32)(02) 519 56 79

In the Netherlands:
  COBIDOC B.V.
  Amsterdam, The Netherlands
  Phone: (+31)(020)622-3955
  Fax: (+31)(020)622-2556

In Israel:
  Arad-Ophir Information Specialists
  30 Binyamin-Midodelo St.
  Tel-Aviv ISRAEL 69546
  Phone: 972 3 64 83 48 8
  Fax: 972 3 64 71 78 0

In Spain:
  Universitat de Barcelona
  Facultats de Fisica i Quimica
  Diagonal, 647
  08028 Barcelona
  Office: 34 3 411 15 77 or 14 75
  Fax: 34 3 411 26 11

For Argentina, Italy, Brazil, and Korea, see the printed sheet.

In all other countries:
  Chemical Abstracts Service
  Columbus, OH, U.S.A.
  Phone: 614-447-3600
  Fax: 614-447-3713
                                | Search  |                          | Display
Search Field Name               | Code    | Search Examples          | Codes
--------------------------------|---------|--------------------------|-------------
Basic Index (contains name      |None     |S TOSYL                   |AF, CN,
  fragments, molecular formula  | (or /BI)|S DIMETHYL ADIPATE        | IN, MF
  fragments, and Collective     |         |S 6CI                     |
  Index codes)(1)               |         |S 1,1(W)DICHLORO          |
                                |         |S C5H10BR2O2              |
CAS Registry Number             |/RN      |S 97-77-8/RN              |RN, AR,
                                |         |S 97-77-8                 | DR, PR
Class Identifier (codes or      |/CI      |S MXS/CI                  |CI
  terms as a bound phrase)      |         |S ALLOY/CI                |
Component Registry Number       |/CRN     |S 79-10-7/CRN             |CRN
Definition                      |/DEF     |S HYDROCARBONS/DEF        |DEF
Entry Date (2)                  |/ED      |S 890810/ED               |Not displayed
Field Availability (codes       |/FA      |S RSD/FA AND L5           |Not displayed
  or terms as a bound phrase)   |         |S MATERIAL COMPOSITION/FA |
File Segment (acronyms or       |/FS      |S 3D/FS                   |FS
  single words)                 |         |S PROTEIN/FS              |
                                |         |S PS/FS                   |
                                |         |S NUCLEIC/FS              |
Polymer Class Term (code        |/PCT     |S POLYAMINE/PCT           |PCT
  or text)                      |         |S PM/PCT                  |
Registry Number Locator         |/LC      |S TSCA/LC                 |LC
Update Date (2)                 |/UP      |S UP>=890000              |Not displayed

(1) Formula fragments searched in the Basic Index must be entered without spaces.
(2) Numeric search field that may be searched using numeric operators or ranges.

Nomenclature Fields

                                | Search  |                          | Display
Search Field Name               | Code    | Search Examples          | Codes
--------------------------------|---------|--------------------------|-------------
Chemical Name                   |/CN      |S 1-CHLORO-1,3-           |CN, IN
                                |         |  BUTADIENE/CN            |
                                |         |S INTERFERON .ALPHA.1?/CN |
Chemical Name Segment * (1)     |/CNS     |S IMINO/CNS               |CN, IN
                                |         |S ?QUAT?/CNS NOT AQUA     |
Heading Parent                  |/HP      |S BENZOIC ACID/HP         |CN, IN
Index Name Segment Heading      |/INS.HP  |S METHYLETHYL/INS.HP      |CN, IN
  Parent                        |         |                          |
Index Name Segment              |/INS.NHP |S ACRYLO/INS.NHP          |CN, IN
  Nonheading Parent             |         |                          |
Other Name Segment              |/ONS     |S ANILINE/ONS             |CN

(1) With left truncation, the input term must contain at least 4 characters.
Molecular Formula Fields

                                | Search  |                          | Display
Search Field Name               | Code    | Search Examples          | Codes
--------------------------------|---------|--------------------------|-------------
Atom Count (1)                  |/ATC     |S 5/ATC                   |Not displayed
Element Count (1)               |/ELC     |S 7-9/ELC                 |Not displayed
Element Count for               |/ELC.SUB |S ELC.SUB>=8              |Not displayed
  Substance (1)                 |         |                          |
Element Formula (2)             |/ELF     |S AL CO LA O/ELF          |AF, MF
Element Ratio, xx (1)           |/ELR.xx  |S 3.1666667/ELR.CH        |Not displayed
  (where xx = CH, CN, CO, HC,   |         |S 1-2/ELR.CN              |
  HN, HO, NC, NH, NO, OC, OH,   |         |S ELR.CO<=1               |
  or ON)                        |         |                          |
Element Symbol                  |/ELS     |S B/ELS AND H/ELS         |Not displayed
Element Symbol for              |/ELS.MCF |S (N (XA) P)/ELS.MCF      |Not displayed
  Multicomponent Formula        |         |                          |
Formula Weight (1)              |/FW      |S 420-460/FW              |Not displayed
Material Composition (3)        |/MAC     |S 1-5 ND/MAC              |STR
Molecular Formula (4)           |/MF      |S C7H3BR2FO2/MF           |AF, MF
                                |         |S C4H4O4.2NA/MF           |
                                |         |S C24 H37 OS P3/MF        |
Number of Components (1)        |/NC      |S F/ELS NOT NC>=2         |Not displayed
Periodic Group                  |/PG      |S B6/PG                   |Not displayed
                                |         |S LNTH/PG                 |
Relative Composition            |/RC      |S FE.CR.NI/RC             |Not displayed
Specific Element Count (1)      |/Element |S 7/SI                    |Not displayed
                                | Symbol  |                          |

(1) Numeric search field that may be searched using numeric operators or ranges.
(2) Formulas must be entered with spaces between the elements.
(3) Combined numeric and text field. Composition terms are numeric and may be searched using numeric operators or ranges. Component terms are text terms.
(4) Formulas may be entered with or without spaces.
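To make the Element Ratio field more concrete, here is a minimal Python sketch. It rests on an assumption that this sheet does not state outright: that /ELR.xx is the count of the first element divided by the count of the second (so ELR.CH is carbon atoms per hydrogen atom). The helper names `element_counts` and `element_ratio` are hypothetical, and the formula `C19 H6` is an invented input chosen only because 19/6 reproduces the value 3.1666667 used in the search example. The parser accepts only the space-separated style, because upper-case formulas without spaces (for example C7H3BR2FO2) cannot be split reliably by a simple pattern.

```python
import re


def element_counts(formula: str) -> dict:
    """Parse a space-separated formula such as 'C24 H37 OS P3' into counts."""
    counts = {}
    for token in formula.split():
        match = re.fullmatch(r"([A-Za-z]+)(\d*)", token)
        if match is None:
            raise ValueError(f"unrecognized formula token: {token!r}")
        symbol = match.group(1).upper()
        # A symbol without a number counts once.
        counts[symbol] = counts.get(symbol, 0) + int(match.group(2) or 1)
    return counts


def element_ratio(formula: str, numerator: str, denominator: str) -> float:
    """Assumed meaning of /ELR.xx: count of first element / count of second."""
    counts = element_counts(formula)
    return counts[numerator.upper()] / counts[denominator.upper()]


if __name__ == "__main__":
    print(element_counts("C24 H37 OS P3"))
    print(round(element_ratio("C19 H6", "C", "H"), 7))
```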
Ring Analysis Data Fields

                                | Search  |                          | Display
Search Field Name               | Code    | Search Examples          | Codes
--------------------------------|---------|--------------------------|-------------
Elemental Analysis for Ring     |/EA      |S C4N-C5N/EA              |RSD
  System (1) (and number of     |         |S 2 C3NO-C6/EA            |
  occurrences of EA in a        |         |                          |
  component structure)          |         |                          |
Elemental Analysis for          |/EAS     |S C5NO4/EAS               |Not displayed
  Smallest Ring (1) (and        |         |S >9 C6/EAS               |
  number of occurrences of      |         |                          |
  EAS in a ring system)         |         |                          |
Elemental Sequence for Ring     |/ES      |S NCOC2-C6/ES             |RSD, SRSD
  System (1) (and number of     |         |S 1-3 O2C4/ES             |
  occurrences of ES in a        |         |                          |
  component structure)          |         |                          |
Elemental Sequence for          |/ESS     |S FE3/ESS                 |Not displayed
  Smallest Ring (1) (and number |         |S >=2 SC2SC2/ESS          |
  of occurrences of ESS in      |         |                          |
  a ring system)                |         |                          |
Number of Ring Systems (2)      |/NRS     |S 7/NRS                   |Not displayed
Number of Ring Systems in a     |/CNRS    |S 4-5/CNRS                |Not displayed
  Component (2)                 |         |                          |
Number of Rings (2) (number     |/NR      |S 10/NR                   |Not displayed
  of smallest rings)            |         |                          |
Number of Rings in a            |/CNR     |S CNR>=12                 |Not displayed
  Component (2) (number of      |         |                          |
  smallest rings)               |         |                          |
Number of Rings in Ring         |/NRRS    |S 5-6/NRRS                |Not displayed
  System (2)                    |         |                          |
Ring Atom Count (2)             |/RATC    |S 4/RATC                  |Not displayed
Ring Element (1) (and number    |/REL     |S SE/REL                  |Not displayed
  of occurrences of REL in a    |         |S 5 P/REL                 |
  ring system)                  |         |                          |
Ring Element Count (2)          |/RELC    |S 6/RELC                  |Not displayed
Ring Elemental Formula (3,1)    |/RELF    |S C N O P/RELF            |Not displayed
  (and number of occurrences    |         |S >3 C N O/RELF           |
  of RELF in a component        |         |                          |
  structure)                    |         |                          |
Ring Identifier (1) (and        |/RID     |S 31779.1.2/RID           |RSD, SRSD
  number of occurrences of RID  |         |S 1938/RID                |
  in a component structure)     |         |S >=2 1949.52/RID         |
Ring Size of Smallest Ring (2,1)|/SZS     |S 8/SZS                   |Not displayed
  (and number of occurrences of |         |S 5 4/SZS                 |
  SZS in a ring system)         |         |                          |
Ring System Formula (1) (and    |/RF      |S C20AGN4/RF              |RSD
  number of occurrences of      |         |S 5 C10/RF                |
  RF in a component structure)  |         |                          |
Size for the Ring System (1)    |/SZ      |S 3-4-5/SZ                |RSD
  (and number of occurrences of |         |S 3 5-5-6/SZ              |
  SZ in a component structure)  |         |                          |

(1) The number of occurrences must be entered first in the search field. It is a numeric term and may be searched using numeric operators or ranges.
(2) Numeric search field that may be searched using numeric operators or ranges.
(3) Formulas must be entered with spaces between the elements.

Biosequence Fields

                                | Search  |                          | Display
Search Field Name               | Code    | Search Examples          | Codes
--------------------------------|---------|--------------------------|-------------
Notes * (1)                     |/NTE     |S CYCLIC/NTE              |NTE
                                |         |S ?CHLORO?/NTE            |
                                |         |S OAA-17/NTE              |
Nucleic Acid Count (2,3)        |/NA.CNT  |S 12-42/NA.CNT            |NA
Nucleic Acid Type (3)           |/NA      |S 12-42 A/NA              |NA
                                |         |S G/NA                    |
Sequence Length (2)             |/SQL     |S 4-20/SQL                |SQL
                                |         |S SQL<=500                |

(1) With left truncation, the input term must contain at least 4 characters.
(2) Numeric search field that may be searched using numeric operators or ranges.
(3) Field contains data only for nucleic acid sequences.
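The numeric terms used in the examples above (a single value such as 7, a range such as 4-20, or an operator form such as <=500 or >9) can be modeled with a short Python sketch. This illustrates how such terms might be interpreted and is not STN's implementation. The function name `numeric_predicate` is hypothetical; the sketch assumes that ranges include both endpoints, and it also accepts `<` and `=`, which do not appear in the examples on this sheet. It handles only the numeric part of a term, so a field prefix such as SQL or UP would have to be removed first.

```python
import operator
import re

_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
_NUMBER = r"\d+(?:\.\d+)?"


def numeric_predicate(term: str):
    """Turn '4-20', '7', '>=8' or '<=500' into a function testing a value."""
    term = term.strip()
    match = re.fullmatch(rf"({_NUMBER})-({_NUMBER})", term)
    if match:
        low, high = float(match.group(1)), float(match.group(2))
        return lambda value: low <= value <= high  # inclusive range (assumed)
    match = re.fullmatch(rf"(>=|<=|>|<|=)?({_NUMBER})", term)
    if match:
        compare = _OPERATORS[match.group(1) or "="]
        bound = float(match.group(2))
        return lambda value: compare(value, bound)
    raise ValueError(f"not a numeric search term: {term!r}")


if __name__ == "__main__":
    in_range = numeric_predicate("4-20")
    print(in_range(4), in_range(21))
    print(numeric_predicate("<=500")(500))
```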
The CM (Component Number) field appears in records for multicomponent substances but it is not a custom display field and cannot be used in display requests.
Dictionary Field Codes

Format  | Content                                      | Examples
--------|----------------------------------------------|--------------
AF      |Alternate Molecular Formula                   |D L4 1-4 AF
AR      |Alternate Registry Number                     |D L1 3 AR
CCI     |Component Class Identifier                    |D CCI 1,3-5
CCN (1) |Condensed Chemical Name                       |D 20 CCN
CDES    |Component Descriptor                          |D CDES 5-10
CI      |Substance Class Identifier                    |D 1-3,7,8 CI
CIL     |Component Isotope at Unknown Location         |D CIL
CMF     |Component Molecular Formula                   |D L1 CMF 3
CN      |Chemical Name                                 |D CN
COMP (2)|Composition                                   |D L7
CRN     |Component Registry Number                     |D 1,3,6 CRN L5
DEF     |Definition                                    |D DEF
DES     |Descriptor                                    |D DES 2
DR      |Deleted Registry Number                       |D L8 DR 1-3
FCN (1) |Full Chemical Name                            |D FCN L3 7
FS      |File Segment                                  |D 1,4 FS
IL      |Isotope at Unknown Location                   |D IL
IN      |CA Index Name                                 |D IN L1 4
LC      |Registry Number Locator                       |D LC 3,4
MF      |Molecular Formula                             |D MF
PCT     |Polymer Class Term                            |D L3 PCT
PR      |Preferred Registry Number                     |D 5,3 PR
RN      |CAS Registry Number                           |D L4 RN 3
RR      |Replaced Registry Number                      |D L3 2 RR
RSD (3) |Ring System Data                              |D RSD
SCN (4) |Short Chemical Name                           |D 5-9 SCN
SR      |Source of Registration                        |D SR 1,3 L12
SRSD (5)|Short Ring System Data                        |D SRSD
STF     |Flat Structure (no stereo indicated)          |D L9 1 3
STR (6) |Structure Diagram (includes stereo bonds and  |D STR
        | R/S/E/Z labels when available)               |
STS (6) |Stereo Structure (includes stereo bonds when  |D CN STS
        | available)                                   |

(1) Names are displayed with CN code.
(2) This is a tabular display that lists composition information and Component Registry Numbers for alloys and tabular inorganic substances.
(3) This is a tabular display that lists EA, ES, SZ, RF, RID, and RID Occurrence Count.
(4) The CA Index Name and all OTHER NAMES are displayed with CN code.
(5) This is a tabular display that lists EA, RID, and RID Occurrence Count.
(6) Stereo structure diagrams are only available on graphics terminals and offline prints.
Biosequence Field Codes

Format  | Content                                      | Examples
--------|----------------------------------------------|--------------
NA      |Nucleic Acid                                  |D 6 9 11 NA
NTE     |Note                                          |D NTE
SEQ     |Sequence (1-letter codes)                     |D SEQ
SEQ3    |Sequence (3-letter codes)                     |D SEQ3 1-10
SQL     |Sequence Length                               |D L3 SQL

Predefined Biosequence Formats

Format  | Content                                      | Examples
--------|----------------------------------------------|--------------
SQD     |RN, AR, PR, DR, RR, FS, SQL, NTE, SEQ         |D 5 SQD
SQD3    |RN, AR, PR, DR, RR, FS, SQL, NTE, SEQ3        |D 2-4 SQD3
SQIDE   |RN, CN, DEF, AR, PR, DR, RR, FS, SQL, NA,     |D L4 SQIDE
        | NTE, SEQ, MF, AF, CI, PCT, SR, LC, IL, DES,  |
        | STR                                          |
SQIDE3  |Same as SQIDE except that 3-letter codes are  |D L4 SQIDE3
        | used for protein sequences                   |
SQN     |RN, CN, AR, PR, FS, SQL, DR, RR               |D SQN L5 6-9
ALL     |All available fields and names, including     |DISPLAY L1 1 ALL
        | biosequence data                             |
FIDE    |All available names and all substance data,   |D FIDE 3 7 L6
        | except biosequence data (RN, CN, DEF, AR,    |
        | PR, FS, DR, RR, MF, AF, CI, PCT, SR, LC,     |
        | IL, DES, RSD, CRN, CMF, CCI, CDES, CIL,      |
        | STR, COMP)                                   |
IDE     |Same as FIDE, except only 50 names are        |D IDE L10
        | displayed and RSD is not displayed (IDE is   |
        | the default)                                 |
REG     |CAS Registry Number(s) (RN, DR, AR, PR, RR)   |D REG
SAM     |IN, SQL, MF, CI, STR, COMP                    |D L3 1-18 SAM
SCAN (1)|IN, SQL, MF, CI, STR, COMP (answer numbers    |D SCAN
        | are not displayed and the answers are        |
        | displayed in random order)                   |
HIT (2) |All fields containing hit terms               |D HIT 5-10
KWIC (2)|All hit terms plus 20 words on either side    |D KWIC 5-10

(1) No online display charge for this option. SCAN must be specified on the command line, i.e., D SCAN or DISPLAY SCAN.
(2) HIT and KWIC are available for all dictionary fields except MAC, RC, and CRN, and in all biosequence fields. KWIC is the same as HIT for all fields except DEF and LC. The entire field containing hit terms is highlighted except for DEF and LC, in which the individual terms are highlighted. The entire RSD table is displayed without highlighting. For NTE, the row(s) of the table containing the hit terms are displayed without highlighting. For SEQ and SEQ3, the amino acid codes causing the hit are highlighted by underlining and also by a statement of their position in the sequence.
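The KWIC idea (a hit term plus 20 words on either side) can be illustrated with a few lines of Python. This is only a sketch of the windowing concept under simple assumptions (text already split into words, one hit position); it does not reproduce how STN builds its displays, and the name `kwic_window` is hypothetical.

```python
def kwic_window(words, hit_index, width=20):
    """Return the hit word plus up to `width` words on either side of it."""
    start = max(0, hit_index - width)                # clip at the field start
    end = min(len(words), hit_index + width + 1)     # clip at the field end
    return words[start:end]


if __name__ == "__main__":
    # Invented sample text: 50 placeholder words with the hit at position 25.
    sample = [f"w{i}" for i in range(50)]
    window = kwic_window(sample, 25)
    print(window[0], window[-1], len(window))
```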
Type          | Definition            | Code     | Examples
--------------|-----------------------|----------|------------------------
Sequence      |Search for sequences   |/SQEP     |S YADAIF/SQEP
Exact,        | that match the query. |          |S 'CYS-ASN-THR-ALA'/SQEP
Protein       | The query must be     |          |
              | completely defined.   |          |
Sequence      |Search for sequences   |/SQEFP    |S YGGFL/SQEFP
Exact         | that match the query  |          |S 'TYR-GLY-GLY-
Family,       | and those in which    |          |  PHE-LEU'/SQEFP
Protein       | family-equivalent     |          |
              | substitution of the   |          |
              | query amino acids     |          |
              | occurs (1).           |          |
Subsequence,  |Search for exact       |/SQSP     |S LAGLL/SQSP
Protein       | answers plus          |          |
              | sequences in which    |          |S F'HCY-STA'LF/SQSP
              | the query sequence    |          |
              | is embedded.          |          |
              | Variability           |          |
              | symbols are allowed.  |          |
Subsequence   |Search for exact       |/SQSFP    |S ATCXAWV/SQSFP
Family,       | sequences,            |          |S 'LEU-ALA-GLY-LEU-
Protein       | subsequences, and     |          |  LEU'/SQSFP
              | answers in which      |          |
              | family-equivalent     |          |
              | substitution of the   |          |
              | query amino acids     |          |
              | occurs (1).           |          |
Sequence      |Search for sequences   |/SQEN     |S ATTTTTTTTTT/SQEN
Exact,        | that match the query. |          |
Nucleic       | Ambiguity codes for   |          |
Acid          | nucleic acids are     |          |
              | allowed.              |          |
Subsequence,  |Search for exact       |/SQSN     |S AAGGTTACTA/SQSN
Nucleic       | answers, plus         |          |
Acid          | sequences in which    |          |
              | the query sequence is |          |
              | embedded. Ambiguity   |          |
              | codes for nucleic     |          |
              | acids and variability |          |
              | symbols are allowed.  |          |

(1) The families of amino acid equivalents retrieved in protein family searches are:
    P, A, G, S, T     (neutral, weakly hydrophobic)
    Q, N, E, D, B, Z  (hydrophilic, acid amine)
    H, K, R           (hydrophilic, basic)
    L, I, V, M        (hydrophobic)
    F, Y, W           (hydrophobic, aromatic)
    C                 (cross-link forming)
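The family-equivalence rule in footnote (1) can be sketched in Python as follows. This is an illustration of the idea behind an exact family match, not the STN search algorithm. The names `FAMILIES`, `family_equivalent`, and `exact_family_match` are hypothetical, the sketch handles 1-letter codes only, and it assumes that a residue code outside the six listed families matches only itself.

```python
# Families of amino acid equivalents, taken from footnote (1).
FAMILIES = ["PAGST", "QNEDBZ", "HKR", "LIVM", "FYW", "C"]
FAMILY_OF = {residue: index
             for index, family in enumerate(FAMILIES)
             for residue in family}


def family_equivalent(a: str, b: str) -> bool:
    """True if two 1-letter codes are identical or share a family."""
    if a == b:
        return True
    family = FAMILY_OF.get(a)
    # Codes outside the listed families match only themselves (assumption).
    return family is not None and family == FAMILY_OF.get(b)


def exact_family_match(query: str, sequence: str) -> bool:
    """True if the sequences have equal length and match position by position."""
    query, sequence = query.upper(), sequence.upper()
    return len(query) == len(sequence) and all(
        family_equivalent(q, s) for q, s in zip(query, sequence))


if __name__ == "__main__":
    print(exact_family_match("YGGFL", "FAGYL"))  # Y~F, G~A, F~Y share families
    print(exact_family_match("YGGFL", "YGGFK"))  # L and K are in different families
```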
The caret (^) is used at the beginning or at the end of a sequence query to search for that sequence at the beginning or end of the sequence field.
The vertical bar (|) is the symbol for alternation, i.e., it separates alternate sequence queries.
(1) For more information on specifying variability in subsequence queries, enter HELP SQQ at an arrow prompt in the Registry File.
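The caret and vertical bar behavior described above can be sketched in Python. This is a simplified model, not the STN query parser: the function name `subsequence_query_matches` is hypothetical, and the sketch assumes that a caret on both ends means the query must equal the whole sequence and that the vertical bar is used only to separate complete alternatives. Other variability symbols (see HELP SQQ) are not modeled.

```python
def subsequence_query_matches(query: str, sequence: str) -> bool:
    """Match a query using '^' anchors and '|' alternation against a sequence."""
    for alternative in query.split("|"):
        at_start = alternative.startswith("^")
        at_end = alternative.endswith("^")
        core = alternative.strip("^")
        if not core:
            continue  # ignore empty alternatives
        if at_start and at_end:
            matched = sequence == core           # assumed: whole-sequence match
        elif at_start:
            matched = sequence.startswith(core)  # anchored at the beginning
        elif at_end:
            matched = sequence.endswith(core)    # anchored at the end
        else:
            matched = core in sequence           # embedded anywhere
        if matched:
            return True
    return False


if __name__ == "__main__":
    print(subsequence_query_matches("^YGG", "YGGFL"))
    print(subsequence_query_matches("GFL^", "YGGFL"))
    print(subsequence_query_matches("AAA|GGF", "YGGFL"))
```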
The SORT command is used to rearrange the search results in either alphabetic or numeric order of the specified field(s).
Field Name                          Field Code  SELECT (1)  SORT
Alternate Molecular Formula         AF          Y (2)       N
Alternate Registry Number           AR          Y (3)       N
CA Index Name                       IN          Y (4)       Y
CAS Registry Number                 RN          Y           Y
Chemical Name                       CN          Y (5)       N
Class Identifier                    CI          Y           N
Component Class Identifier          CCI         Y (6)       N
Component Molecular Formula         CMF         Y (7)       N
Component Registry Number           CRN         Y           N
Definition                          DEF         Y           N
Deleted Registry Number             DR          Y (3)       N
Elemental Analysis for              EA          Y           N
  Ring System
Elemental Sequence for              ES          Y           N
  Ring System
File Segment                        FS          Y           Y
Full Chemical Name                  FCN         Y           N
Molecular Formula                   MF          Y           N
Names                               NAME        Y (8)       N
Nucleic Acid Sequence               SQEN        Y           N
  (exact search form)
Nucleic Acid Sequence               SQSN        Y           N
  (subsequence search form)
Polymer Class Term                  PCT         Y           N
Preferred Registry Number           PR          Y (3)       N
Protein Sequence                    SQEFP       Y           N
  (exact family search form)
Protein Sequence                    SQEP        Y           N
  (exact search form)
Protein Sequence                    SQSFP       Y           N
  (subsequence family search form)
Protein Sequence                    SQSP        Y           N
  (subsequence search form)
Registry Number Locator             LC          Y           N
Registry Numbers and Names          CHEM        Y (9)       N
  (default)
Replacing Registry Number           RR          Y (5)       N
Ring Identifier                     RID         Y           N
Ring System Formula                 RF          Y           N
Sequence (1-letter codes)           SEQ         Y           N
Sequence (3-letter codes)           SEQ3        Y           N
Sequence Length                     SQL         N           Y
Short Chemical Names                SCN         Y (4)       N
Size for the Ring System            SZ          Y           N
Source of Registration              SR          Y           N

(1) HIT may be used to restrict terms extracted to terms that match the search expression used to create the answer set, e.g., SEL HIT CN.
(2) /MF is appended.
(3) /RN is appended.
(4) /CN is appended.
(5) CA Index Name, first 50 names in alphabetical order, and any additional hit names are selected.
(6) /CI is appended.
(7) /BI is appended.
(8) All names except inverted names are selected and /BI is appended.
(9) AR, DR, PR, RN, RR, and all names except inverted names are selected and /BI is appended.