LREGISTRY

STN Database Summary Sheet

 The LREGISTRY File is a training database intended for learning how to use the REGISTRY File. It is a chemical structure and dictionary database that contains approximately 125,000 substance records for compounds identified by the Chemical Abstracts Service (CAS) Registry System. These records are for the substances indexed in the LCA File and the LCASREACT File. All substance records contain a unique CAS Registry Number(R) and index name. Substance records may also have synonyms, molecular formulas, alloy composition tables, classes for polymers, nucleic acid and protein sequences, and structure diagrams, all of which are searchable and displayable.

 Left truncation is available in the Chemical Name Segment (/CNS) and Notes (/NTE) fields.

 LREGISTRY is a member of the following file cluster: LEARNING.

 

Subject Coverage

The LREGISTRY File contains all types of chemical substances described in the literature. All types of inorganic and organic substances are covered including: alloys, biosequences, coordination compounds, minerals, mixtures, polymers and salts.

 

Sources

File Data

User Aids

Database Producer

Chemical Abstracts Service
2540 Olentangy River Road
P. O. Box 3012
Columbus, OH 43210 USA

Database Representative

In the U.K. and Ireland:  
The Royal Society of Chemistry (RSC)
Cambridge, United Kingdom
Phone: (+44) (1223) 432110
FAX:   (+44) (0223) 423623

In the Federal Republic of Germany, Austria, and Switzerland: 
Fachinformationszentrum Chemie GmbH
Berlin, Federal Republic of Germany
Phone: (+49)(030) 39076-201
Fax:   (+49) (030) 39076-333

In Japan:     
The Japan Association for International Chemical Information
Tokyo, Japan
Phone: (+81)(033) 5978-3601
FAX:   (+81)(033) 5978-3600

In France:
Compagnie d'Application et d'Assistance en Documentation (CAPADOC)
Boulogne, France
Phone: (+33)(01)4603-1085
FAX:   (+33)(01)4603-9890

In Australia: 
Damon Ridley
School of Chemistry, F11
University of Sydney
NSW 2006
Sydney, Australia
Phone: (+61) (02) 351 2180
Fax:   (+61) (02) 351 6650
Email: dridley@chem.usyd.edu.au

In Finland:   
Technical Research Centre of Finland (VTT)
Espoo, Finland
Phone: (+358)(90) 4564386
FAX:   (+358)(90)456-4374

In Sweden:    
Information and Documentation Center Royal Institute of
 Technology Library (IDC-KTHB)
Stockholm, Sweden
Phone: (+46)(08) 790 89 50
FAX:   (+46)(08) 790 8954

In Belgium:   
Royal Library
NCWDT-CNDST
Keizerslaan 4 Bld de l'Empereur
Brussels, Belgium
Phone: (+32)(02) 519.56.44
Fax:   (+32)(02) 519 56 79

In the Netherlands: 
COBIDOC B.V.
Amsterdam, The Netherlands
Phone: (+31)(020)622-3955
Fax:   (+31)(020)622-2556

In Israel:
Arad-Ophir Information Specialists
30 Binyamin-Midodelo St.
Tel-Aviv
ISRAEL 69546
Phone:  972 3 64 83 48 8
Fax:    972 3 64 71 78 0

In Spain:
Universitat de Barcelona
Facultats de Fisica i Quimica
Diagonal, 647
08028 Barcelona
Office:  34 3 411 15 77  or 14 75
Fax:     34 3 411 26 11

For Argentina, Italy, Brazil, and Korea, see the printed
 sheet. In all other countries:
Chemical Abstracts Service
Columbus, OH, U.S.A.
Phone: 614-447-3600
Fax:   614-447-3713

Search and Display Field Codes

Fields that allow left truncation are marked with an asterisk (*).
                                | Search  |                          |   Display
       Search Field Name        |  Code   |     Search Examples      |    Codes
--------------------------------|---------|--------------------------|-------------
Basic Index (contains name      |None     |S TOSYL                   |AF, CN,
 fragments, molecular formula   | (or /BI)|S DIMETHYL ADIPATE        | IN, MF
 fragments, and Collective      |         |S 6CI                     |
 Index codes)(1)                |         |S 1,1(W)DICHLORO          |
                                |         |S C5H10BR2O2              |
CAS Registry Number             |/RN      |S 97-77-8/RN              |RN, AR,
                                |         |S 97-77-8                 | DR, PR
Class Identifier (codes or      |/CI      |S MXS/CI                  |CI
 terms as a bound phrase)       |         |S ALLOY/CI                |
Component Registry Number       |/CRN     |S 79-10-7/CRN             |CRN
Definition                      |/DEF     |S HYDROCARBONS/DEF        |DEF
Entry Date (2)                  |/ED      |S 890810/ED               |Not displayed
Field Availability (codes       |/FA      |S RSD/FA AND L5           |Not displayed
 or terms as a bound phrase)    |         |S MATERIAL COMPOSITION/FA |
File Segment (acronyms or       |/FS      |S 3D/FS                   |FS
 single words)                  |         |S PROTEIN/FS              |
                                |         |S PS/FS                   |
                                |         |S NUCLEIC/FS              |
Polymer Class Term (code        |/PCT     |S POLYAMINE/PCT           |PCT
 or text)                       |         |S PM/PCT                  |
Registry Number Locator         |/LC      |S TSCA/LC                 |LC
Update Date (2)                 |/UP      |S UP>=890000              |Not displayed

(1) Formula fragments searched in the Basic Index must be entered without spaces.
(2) Numeric search field that may be searched using numeric operators or ranges.
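
The /RN examples above can be sanity-checked before they are sent to the file. The sketch below is not part of STN; it relies on the general property that the last digit of a CAS Registry Number is a check digit (the remaining digits, read right to left, are weighted 1, 2, 3, ... and summed modulo 10). The function names cas_check_digit_ok and rn_query are hypothetical helpers introduced only for illustration.

```python
import re


def cas_check_digit_ok(rn: str) -> bool:
    """Return True if rn has the form N...N-NN-N and its check digit is consistent."""
    m = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if not m:
        return False
    body = m.group(1) + m.group(2)   # all digits except the check digit
    check = int(m.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def rn_query(rn: str) -> str:
    """Build a search line in the style of the sheet's example 'S 97-77-8/RN'."""
    if not cas_check_digit_ok(rn):
        raise ValueError(f"{rn!r} does not look like a valid CAS Registry Number")
    return f"S {rn}/RN"


print(rn_query("97-77-8"))
```

A check like this only catches typing errors; it says nothing about whether the substance is actually present in LREGISTRY.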

Nomenclature Fields 

                                | Search  |                          |   Display
       Search Field Name        |  Code   |     Search Examples      |    Codes
--------------------------------|---------|--------------------------|-------------
Chemical Name                   |/CN      |S 1-CHLORO-1,3-           |CN, IN
                                |         |  BUTADIENE/CN            |
                                |         |S INTERFERON .ALPHA.1?/CN |
Chemical Name Segment * (1)     |/CNS     |S IMINO/CNS               |CN, IN
                                |         |S ?QUAT?/CNS NOT AQUA     |
Heading Parent                  |/HP      |S BENZOIC ACID/HP         |CN, IN
Index Name Segment Heading      |/INS.HP  |S METHYLETHYL/INS.HP      |CN, IN
 Parent                         |         |                          |
Index Name Segment              |/INS.NHP |S ACRYLO/INS.NHP          |CN, IN
 Nonheading Parent              |         |                          |
Other Name Segment              |/ONS     |S ANILINE/ONS             |CN

(1) With left truncation, the input term must contain at least 4 characters.

Molecular Formula Fields  

                                | Search  |                          |   Display
       Search Field Name        |  Code   |     Search Examples      |    Codes
--------------------------------|---------|--------------------------|-------------
Atom Count (1)                  |/ATC     |S 5/ATC                   |Not displayed
Element Count (1)               |/ELC     |S 7-9/ELC                 |Not displayed
Element Count for               |/ELC.SUB |S ELC.SUB>=8              |Not displayed
 Substance (1)                  |         |                          |
Element Formula (2)             |/ELF     |S AL CO LA O/ELF          |AF, MF
Element Ratio, xx (1)           |/ELR.xx  |S 3.1666667/ELR.CH        |Not displayed
 (where xx = CH, CN, CO, HC,    |         |S 1-2/ELR.CN              |
 HN, HO, NC, NH, NO, OC, OH,    |         |S ELR.CO<=1               |
 or ON)                         |         |                          |
Element Symbol                  |/ELS     |S B/ELS AND H/ELS         |Not displayed
Element Symbol for              |/ELS.MCF |S (N (XA) P)/ELS.MCF      |Not displayed
 Multicomponent Formula         |         |                          |
Formula Weight (1)              |/FW      |S 420-460/FW              |Not displayed
Material Composition (3)        |/MAC     |S 1-5 ND/MAC              |STR
Molecular Formula (4)           |/MF      |S C7H3BR2FO2/MF           |AF, MF
                                |         |S C4H4O4.2NA/MF           |
                                |         |S C24 H37 OS P3/MF        |
Number of Components (1)        |/NC      |S F/ELS NOT NC>=2         |Not displayed
Periodic Group                  |/PG      |S B6/PG                   |Not displayed
                                |         |S LNTH/PG                 |
Relative Composition            |/RC      |S FE.CR.NI/RC             |Not displayed
Specific Element Count (1)      |/Element |S 7/SI                    |Not displayed
                                | Symbol  |                          |

(1) Numeric search field that may be searched using numeric operators or ranges.
(2) Formulas must be entered with spaces between the elements.
(3) Combined numeric and text field.  Composition terms are numeric and may be
    searched using numeric operators or ranges.  Component terms are text terms.
(4) Formulas may be entered with or without spaces.
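
The numeric formula fields can be related to a displayed molecular formula with a little arithmetic. The sketch below is illustrative only and rests on assumptions the sheet does not spell out: that the element count is the number of distinct elements in the formula, and that an element ratio such as ELR.CH is the carbon count divided by the hydrogen count. The helper names parse_mf and element_ratio are hypothetical, and the parser handles only single-component formulas written with spaces and conventional capitalization, as in the MF display "C27 H46 O".

```python
import re


def parse_mf(mf: str) -> dict:
    """Parse a space-separated formula such as 'C27 H46 O' into element counts."""
    counts = {}
    for token in mf.split():
        m = re.fullmatch(r"([A-Z][a-z]?)(\d*)", token)
        if not m:
            raise ValueError(f"unrecognized formula token: {token!r}")
        symbol, n = m.group(1), m.group(2)
        counts[symbol] = counts.get(symbol, 0) + (int(n) if n else 1)
    return counts


def element_ratio(counts: dict, numerator: str, denominator: str) -> float:
    """Assumed reading of ELR.xx: count of the first element over the second."""
    return counts[numerator] / counts[denominator]


cholesterol = parse_mf("C27 H46 O")
print(len(cholesterol))                                  # distinct elements (assumed ELC)
print(round(element_ratio(cholesterol, "C", "H"), 7))    # assumed ELR.CH
```

Under the same assumption, a substance with 19 carbon and 6 hydrogen atoms would give 19/6, which rounds to the value 3.1666667 used in the ELR.CH search example; that is arithmetic only, not a statement about any particular record.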

Ring Analysis Data Fields 

                                | Search  |                          |   Display
       Search Field Name        |  Code   |     Search Examples      |    Codes
--------------------------------|---------|--------------------------|-------------
Elemental Analysis for Ring     |/EA      |S C4N-C5N/EA              |RSD
 System (1) (and number of      |         |S 2 C3NO-C6/EA            |
 occurrences of EA in a         |         |                          |
 component structure)           |         |                          |
Elemental Analysis for          |/EAS     |S C5NO4/EAS               |Not displayed
 Smallest Ring (1) (and         |         |S >9 C6/EAS               |
 number of occurrences of       |         |                          |
 EAS in a ring system)          |         |                          |
Elemental Sequence for Ring     |/ES      |S NCOC2-C6/ES             |RSD, SRSD
 System (1) (and number of      |         |S 1-3 O2C4/ES             |
 occurrences of ES in a         |         |                          |
 component structure)           |         |                          |
Elemental Sequence for          |/ESS     |S FE3/ESS                 |Not displayed
 Smallest Ring (1) (and number  |         |S >=2 SC2SC2/ESS          |
 of occurrences of ESS in       |         |                          |
 a ring system)                 |         |                          |
Number of Ring Systems  (2)     |/NRS     |S 7/NRS                   |Not displayed
Number of Ring Systems in a     |/CNRS    |S 4-5/CNRS                |Not displayed
 Component (2)                  |         |                          |
Number of Rings (2) (number     |/NR      |S 10/NR                   |Not displayed
 of smallest rings)             |         |                          |
Number of Rings in a            |/CNR     |S CNR>=12                 |Not displayed
 Component (2) (number of       |         |                          |
 smallest rings)                |         |                          |
Number of Rings in Ring         |/NRRS    |S 5-6/NRRS                |Not displayed
 System (2)                     |         |                          |
Ring Atom Count (2)             |/RATC    |S 4/RATC                  |Not displayed
Ring Element (1) (and number    |/REL     |S SE/REL                  |Not displayed
 of occurrences of REL in a     |         |S 5 P/REL                 |
 ring system)                   |         |                          |
Ring Element Count (2)          |/RELC    |S 6/RELC                  |Not displayed
Ring Elemental Formula (3,1)    |/RELF    |S C N O P/RELF            |Not displayed
 (and number of occurrences     |         |S >3 C N O/RELF           |
 of RELF in a component         |         |                          |
 structure)                     |         |                          |
Ring Identifier (1) (and        |/RID     |S 31779.1.2/RID           |RSD, SRSD
 number of occurrences of RID   |         |S 1938/RID                |
 in  a component structure)     |         |S >=2 1949.52/RID         |
Ring Size of Smallest Ring (2,1)|/SZS     |S 8/SZS                   |Not displayed
 (and number of occurrences of  |         |S 5 4/SZS                 |
 SZS in  a ring system)         |         |                          |
Ring System Formula (1) (and    |/RF      |S C20AGN4/RF              |RSD
 number of occurrences of       |         |S 5 C10/RF                |
 RF in a component structure)   |         |                          |
Size for the Ring System (1)    |/SZ      |S 3-4-5/SZ                |RSD
 (and number of occurrences of  |         |S 3 5-5-6/SZ              |
 SZ in a component structure)   |         |                          |

(1) The number of occurrences must be entered first in the search field.  It is
    a numeric term and may be searched using numeric operators or ranges.
(2) Numeric search field that may be searched using numeric operators or ranges.
(3) Formulas must be entered with spaces between the elements.

Biosequence Fields  

                                | Search  |                          |   Display
       Search Field Name        |  Code   |     Search Examples      |    Codes
--------------------------------|---------|--------------------------|-------------
Notes * (1)                     |/NTE     |S CYCLIC/NTE              |NTE
                                |         |S ?CHLORO?/NTE            |
                                |         |S OAA-17/NTE              |
Nucleic Acid Count (2,3)        |/NA.CNT  |S 12-42/NA.CNT            |NA
Nucleic Acid Type (3)           |/NA      |S 12-42 A/NA              |NA
                                |         |S G/NA                    |
Sequence Length (2)             |/SQL     |S 4-20/SQL                |SQL
                                |         |S SQL<=500                |

(1) With left truncation, the input term must contain at least 4 characters.
(2) Numeric search field that may be searched using numeric operators or ranges.
(3) Field contains data only for nucleic acid sequences.
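
As a rough illustration of what the /SQL and /NA values describe, the sketch below computes a sequence length and per-base counts from a sequence typed in the blocked form used by the SEQ display. It assumes that the NA field holds one count per base, which is how the NA line of a displayed nucleic acid record reads; the function name na_summary is hypothetical and the short input is the first two blocks of a displayed sequence, not a complete record.

```python
from collections import Counter


def na_summary(seq: str):
    """Return (length, per-base counts) for a sequence written in 10-residue blocks."""
    cleaned = "".join(seq.split()).lower()   # drop the spaces between blocks
    counts = Counter(cleaned)
    return len(cleaned), dict(sorted(counts.items()))


length, counts = na_summary("tgaaagaccc caccataagg")
print(length, counts)
```

Values computed this way could then be used to choose a range for a numeric search such as S 12-42/NA.CNT or S SQL<=500.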

Limiting Search Codes

                              |     Search     |                 |   Display
      Search Field Name       |      Code      | Search Examples |    Code
------------------------------+----------------+-----------------+-------------
Answers completely iterated   |/COMPLETE (1)   |S L4/COM (2)     |Not displayed
Answers incompletely iterated |/INCOMPLETE (1) |S L4/INC (2)     |Not displayed

(1) The code may be abbreviated to the first three letters.
(2) Only an L-number for an answer set created in REGISTRY may be limited.

Structure Search Terms

                Terms (1,2)                 |   Search Examples
--------------------------------------------|---------------------
L-numbers of structures built using the     |SEARCH L1 FAM SAM
 STRUCTURE command or uploaded from STN     |SEA L1 AND L2 SSS FUL
 Express (Boolean logic allowed between     |
 the L-numbers)                             |
L-numbers of screen sets created using the  |S L3 OR L4 SSS SAM
 SCREEN command (Boolean logic allowed      |
 between the L-numbers)                     |
L-numbers of structures built using the     |S L1 AND L2 NOT L3
 STRUCTURE command or uploaded from STN     |
 Express combined with L-numbers of screen  |
 sets created using the SCREEN command      |
 (Boolean logic allowed between L-numbers)  |

(1) The L-number answer set from a structure search may be combined with
    dictionary terms, e.g., S L3 AND TSCA/LC.
(2) For Sequence Search Terms see the FEATURES section.

Types of Structure Searching

                 |                                 | Search  |
    Type (1)     |           Definition            |  Code   |  Search Examples
-----------------|---------------------------------|---------|-------------------
Substructure     |Search for substances that       |SSS      |SEARCH L1 SSS FUL
 (default)       | match the query.                |         |S L2 OR L3 SSS SAM
                 | Substitution is allowed at      |         |S L7 SSS RAN
                 | all open positions.             |         |
                 | Additional components may       |         |
                 | be retrieved.                   |         |
Closed           |Search for substances that       |CSS      |SEARCH L1 CSS FUL
 Substructure    | match the query exactly.        |         |S L2 NOT L3 CSS
                 | Substitution is allowed at      |         |S L4 OR L5 CSS RAN
                 | positions opened by             |         |
                 | CONNECT. Additional             |         |
                 | components may be               |         |
                 | retrieved.                      |         |
Family           |Search for substances that       |FAM      |S L6 FAM SAM
                 | match the query exactly.        |         |
                 | Additional components may       |         |
                 | be retrieved.                   |         |
Exact            |Search for substances that       |EXA      |SEA L5 EXA FUL
                 | match the query exactly.        |         |

(1) For Sequence Search Types see the Features section.

Scopes of Structure Searches

                 |                                 | Search  |
      Scope      |           Definition            |  Code   |  Search Examples
-----------------|---------------------------------|---------|-------------------
Sample           |Search a fixed 5% of the file.   |SAM      |SEARCH L3 EXA SAM
 (default)       |                                 |         |S L6 NOT L7 SSS SAM
Full             |Search 100% of the file.         |FUL      |S L5 OR L8 SSS FUL
Range            |Search a user specified portion  |RAN      |S L4 RAN=
                 | of the file.                    |         | (110507-58-9,)
                 |                                 |         |S L3 FAM RAN=
                 |                                 |         | (109784-14-7,
                 |                                 |         | 109904-92-9)
Subset Sample    |Search a fixed sample of an      |SUB SAM  |S L7 CSS SUB=L5 SAM
                 | answer set created by a search  |         |
                 | in LREGISTRY.                   |         |
Subset Range     |Search a user specified portion  |SUB RAN  |S L3 SUB=L2
                 | of an answer set created by a   |         | RAN=(,50-11-3)
                 | search in LREGISTRY.            |         |
Subset Full      |Search 100% of an answer set     |SUB FUL  |S L8 SUB=L6 FAM FUL
                 | created by a search in          |         |
                 | LREGISTRY.                      |         |

DISPLAY and PRINT Formats

Any combination of individual field codes and any combination of predefined format codes may be used. However, individual codes may not be combined with system predefined format codes. Multiple codes must be separated by commas or spaces. The fields are displayed in the order requested. Highlighting must be ON during SEARCH in order to use the HIT and KWIC formats.

 The CM (Component Number) field appears in records for multicomponent substances but it is not a custom display field and cannot be used in display requests.

Dictionary Field Codes

Format  |                   Content                    |   Examples
--------|----------------------------------------------|--------------
AF      |Alternate Molecular Formula                   |D L4 1-4 AF
AR      |Alternate Registry Number                     |D L1 3 AR
CCI     |Component Class Identifier                    |D CCI 1,3-5
CCN (1) |Condensed Chemical Name                       |D 20 CCN
CDES    |Component Descriptor                          |D CDES 5-10
CI      |Substance Class Identifier                    |D 1-3,7,8 CI
CIL     |Component Isotope at Unknown Location         |D CIL
CMF     |Component Molecular Formula                   |D L1 CMF 3
CN      |Chemical Name                                 |D CN
COMP (2)|Composition                                   |D L7
CRN     |Component Registry Number                     |D 1,3,6 CRN L5
DEF     |Definition                                    |D DEF
DES     |Descriptor                                    |D DES 2
DR      |Deleted Registry Number                       |D L8 DR 1-3
FCN (1) |Full Chemical Name                            |D FCN L3 7
FS      |File Segment                                  |D 1,4 FS
IL      |Isotope at Unknown Location                   |D IL
IN      |CA Index Name                                 |D IN L1 4
LC      |Registry Number Locator                       |D LC 3,4
MF      |Molecular Formula                             |D MF
PCT     |Polymer Class Term                            |D L3 PCT
PR      |Preferred Registry Number                     |D 5,3 PR
RN      |CAS Registry Number                           |D L4 RN 3
RR      |Replaced Registry Number                      |D L3 2 RR
RSD (3) |Ring System Data                              |D RSD
SCN (4) |Short Chemical Name                           |D 5-9 SCN
SR      |Source of Registration                        |D SR 1,3 L12
SRSD (5)|Short Ring System Data                        |D SRSD
STF     |Flat Structure (no stereo indicated)          |D L9 1 3
STR (6) |Structure Diagram (includes stereo bonds and  |D STR
        | R/S/E/Z labels when available)               |
STS (6) |Stereo Structure (includes stereo bonds when  |D CN STS
        | available)                                   |

(1) Names are displayed with CN code.
(2) This is a tabular display that lists composition information and
    Component Registry Numbers for alloys and tabular inorganic substances.
(3) This is a tabular display that lists EA, ES, SZ, RF, RID, and RID
    Occurrence Count.
(4) The CA Index Name and all OTHER NAMES are displayed with CN code.
(5) This is a tabular display that lists EA, RID, and RID Occurrence Count.
(6) Stereo structure diagrams are only available on graphics terminals and
    offline prints.

Biosequence Field Codes

Format  |                   Content                    |   Examples
--------|----------------------------------------------|--------------
NA      |Nucleic Acid                                  |D 6 9 11 NA
NTE     |Note                                          |D NTE
SEQ     |Sequence (1-letter codes)                     |D SEQ
SEQ3    |Sequence (3-letter codes)                     |D SEQ3 1-10
SQL     |Sequence Length                               |D L3 SQL

Predefined Biosequence Formats

Format  |                   Content                    |   Examples
--------|----------------------------------------------|--------------
SQD     |RN, AR, PR, DR, RR, FS, SQL, NTE, SEQ         |D 5 SQD
SQD3    |RN, AR, PR, DR, RR, FS, SQL, NTE, SEQ3        |D 2-4 SQD3
SQIDE   |RN, CN, DEF, AR, PR, DR, RR, FS, SQL, NA,     |D L4 SQIDE
        | NTE, SEQ, MF, AF, CI, PCT, SR, LC, IL, DES,  |
        | STR                                          |
SQIDE3  |Same as SQIDE except that 3-letter codes are  |D L4 SQIDE3
        | used for protein sequences                   |
SQN     |RN, CN, AR, PR, FS, SQL, DR, RR               |D SQN L5 6-9
ALL     |All available fields and names, including     |DISPLAY L1 1 ALL
        | biosequence data                             |
FIDE    |All available names and all substance data,   |D FIDE 3 7 L6
        | except biosequence data (RN, CN, DEF, AR,    |
        | PR, FS, DR, RR, MF, AF, CI, PCT, SR, LC,     |
        | IL, DES, RSD, CRN, CMF, CCI, CDES, CIL,      |
        | STR, COMP)                                   |
IDE     |Same as FIDE, except only 50 names are        |D IDE L10
        | displayed and RSD is not displayed (IDE is   |
        | the default)                                 |
REG     |CAS Registry Number(s) (RN, DR, AR, PR, RR)   |D REG
SAM     |IN, SQL, MF, CI, STR, COMP                    |D L3 1-18 SAM
SCAN (1)|IN, SQL, MF, CI, STR, COMP (answer numbers    |D SCAN
        | are not displayed and the answers are        |
        | displayed in random order)                   |
HIT (2) |All fields containing hit terms               |D HIT 5-10
KWIC (2)|All hit terms plus 20 words on either side    |D KWIC 5-10

(1) No online display charge for this option. SCAN must be specified on the
    command line, i.e., D SCAN or DISPLAY SCAN.
(2) HIT and KWIC are available for all dictionary fields except MAC, RC, and
    CRN, and in all biosequence fields. KWIC is the same as HIT for all
    fields except DEF and LC. The entire field containing hit terms is
    highlighted except for DEF and LC, in which the individual terms are
    highlighted. The entire RSD table is displayed without highlighting.
    For NTE, the row(s) of the table containing the hit terms are displayed
    without highlighting. For SEQ and SEQ3, the amino acid codes causing the
    hit are highlighted by underlining and are also identified by a
    statement of their position in the sequence.
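
The rule that individual field codes may not be mixed with predefined format codes can be checked mechanically before a DISPLAY request is entered. The sketch below is one possible reading of that rule, not STN behavior: the code sets are transcribed from the tables in this section, the function name classify_display_codes is hypothetical, and answer numbers or L-numbers are assumed to have been stripped from the request already.

```python
# Codes transcribed from the field-code and predefined-format tables.
FIELD_CODES = {
    "AF", "AR", "CCI", "CCN", "CDES", "CI", "CIL", "CMF", "CN", "COMP",
    "CRN", "DEF", "DES", "DR", "FCN", "FS", "IL", "IN", "LC", "MF", "PCT",
    "PR", "RN", "RR", "RSD", "SCN", "SR", "SRSD", "STF", "STR", "STS",
    "NA", "NTE", "SEQ", "SEQ3", "SQL",
}
PREDEFINED_FORMATS = {
    "SQD", "SQD3", "SQIDE", "SQIDE3", "SQN", "ALL", "FIDE", "IDE", "REG",
    "SAM", "SCAN", "HIT", "KWIC",
}


def classify_display_codes(codes: str) -> str:
    """Return 'field' or 'predefined'; raise ValueError for unknown or mixed codes."""
    tokens = codes.replace(",", " ").upper().split()
    if not tokens:
        raise ValueError("no display codes given")
    unknown = [t for t in tokens
               if t not in FIELD_CODES and t not in PREDEFINED_FORMATS]
    if unknown:
        # CM lands here: it appears in records but is not a display code.
        raise ValueError(f"not a display code: {', '.join(unknown)}")
    kinds = {"predefined" if t in PREDEFINED_FORMATS else "field" for t in tokens}
    if len(kinds) == 2:
        raise ValueError("field codes and predefined formats cannot be combined")
    return kinds.pop()


print(classify_display_codes("CN, STS"))
print(classify_display_codes("SQD"))
```

A request such as "IDE RSD" would be rejected by this check because it combines a predefined format with an individual field code.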

Sequence Search Terms

                Terms                 |       Search Examples
--------------------------------------|-----------------------------
Single letter codes for common        |S LAGLL/SQSP
 amino acids (1)                      |
Three-letter codes for common         |S 'LEU-ALA-GLY-LEU-LEU'/SQSFP
 and uncommon amino acids (1) (2).    |S F'HCY-STA'LF/SQSP
 Enclose codes or strings of codes    |S 'GLP'AGYSK/SQEP
 in single quotes. Use dashes to      |S 'CYS-ASN-THR-ALA'/SQEP
 separate codes in strings.           |
Single letter codes for nucleic       |S ATTTTTTTTTT/SQEN
 acids (3)                            |S AAGGTTACTA/SQSN

(1) Enter HELP AAC at an arrow prompt to display a table of the 1- and
    3-letter codes for common amino acids.
(2) Enter HELP AAU at an arrow prompt to display a table of the 3-letter
    codes for uncommon amino acids.
(3) Enter HELP NUC at an arrow prompt to display a table of the codes for
    nucleic acids.

Types of Sequence Searches

Sequence data for nucleic acid and protein sequences are displayed in the SEQ field with 1-letter codes and the SEQ3 field with 3-letter codes for proteins only.

 

     Type     |      Definition       |   Code   |        Examples  
--------------|-----------------------|----------|------------------------
Sequence      |Search for sequences   |/SQEP     |S YADAIF/SQEP
 Exact,       | that match the query. |          |S 'CYS-ASN-THR-ALA'/SQEP
 Protein      | The query must be     |          |
              | completely defined.   |          |
Sequence      |Search for sequences   |/SQEFP    |S YGGFL/SQEFP
 Exact        | that match the query  |          |S 'TYR-GLY-GLY-
 Family,      | and those in which    |          |  PHE-LEU'/SQEFP
 Protein      | family-equivalent     |          |
              | substitution of the   |          |
              | query amino acids     |          |
              | occurs (1).           |          |
Subsequence,  |Search for exact       |/SQSP     |S LAGLL/SQSP
 Protein      | answers plus          |          |
              | sequences in which    |          |S F'HCY-STA'LF/SQSP
              | the query sequence    |          |
              | is embedded.          |          |
              | Variability           |          |
              | symbols are allowed.  |          |
Subsequence   |Search for exact       |/SQSFP    |S ATCXAWV/SQSFP
 Family,      | sequences,            |          |S 'LEU-ALA-GLY-LEU-
 Protein      | subsequences, and     |          |   LEU'/SQSFP
              | answers in which      |          |
              | family-equivalent     |          |
              | substitution of the   |          |
              | query amino acids     |          |
              | occurs (1).           |          |
Sequence      |Search for sequences   |/SQEN     |S ATTTTTTTTTT/SQEN
 Exact,       | that match the query. |          |
 Nucleic      | Ambiguity codes for   |          |
 Acid         | nucleic acids are     |          |
              | allowed.              |          |
Subsequence,  |Search for exact       |/SQSN     |S AAGGTTACTA/SQSN
 Nucleic      | answers, plus         |          |
 Acid         | sequences in which    |          |
              | the query sequence is |          |
              | embedded. Ambiguity   |          |
              | codes for nucleic     |          |
              | acids and variability |          |
              | symbols are allowed.  |          |

(1) The families of amino acid equivalents retrieved in protein
    family searches are:

        P, A, G, S, T           (neutral, weakly hydrophobic)
        Q, N, E, D, B, Z        (hydrophilic, acid amine)
        H, K, R                 (hydrophilic, basic)
        L, I, V, M              (hydrophobic)
        F, Y, W                 (hydrophobic, aromatic)
        C                       (cross-link forming)
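
The family rule can be made concrete with a small matcher. The sketch below is not the search engine's algorithm; it only applies the six families listed above, using single-letter codes. It assumes that a residue outside the listed families is equivalent only to itself, and the function names are hypothetical.

```python
# Families of equivalent amino acids, as listed in footnote (1).
FAMILIES = ["PAGST", "QNEDBZ", "HKR", "LIVM", "FYW", "C"]
FAMILY_OF = {aa: i for i, family in enumerate(FAMILIES) for aa in family}


def same_family(a: str, b: str) -> bool:
    """True if a and b are identical or belong to the same listed family."""
    if a == b:
        return True
    fa = FAMILY_OF.get(a)
    return fa is not None and fa == FAMILY_OF.get(b)


def exact_family_match(query: str, seq: str) -> bool:
    """Analogue of /SQEFP: same length, every position family-equivalent."""
    return len(query) == len(seq) and all(
        same_family(q, s) for q, s in zip(query, seq))


def subsequence_family_positions(query: str, seq: str) -> list:
    """Analogue of /SQSFP: 1-based start positions where the query is embedded."""
    n = len(query)
    return [i + 1 for i in range(len(seq) - n + 1)
            if exact_family_match(query, seq[i:i + n])]


print(exact_family_match("YGGFL", "FGAFI"))
print(subsequence_family_positions("LAGLL", "MKIGSVMQ"))
```

In the first call every residue of the query is replaced by a member of its own family, so the two sequences count as a family match; in the second, the only embedded family-equivalent window starts at position 3.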

Variability Symbols For Subsequence Searches (/SQSP, /SQSFP, and /SQSN)(1)

  Symbol   |              Function              |    Search Examples
-----------|------------------------------------|----------------------
[ ]        |to specify alternate residues       |S LGP[VL]/SQSP
           |                                    |S LGP['VAL''LEU']/SQSP
           |                                    |
[-]        |to exclude a specific residue       |S LGP[-H]/SQSP
           | or alternate residues              |S LGP[-'HIS']/SQSP
           |                                    |S LGP[-HL]/SQSP
           |                                    |
{m}        |to repeat the preceding sequence    |S (FL){2}/SQSP
           | or sequence query (L#, E#, or      |S L4{2}/SQSP
           | saved query) m times               |S NAME/Q{3}/SQSP
           |                                    |S (CTG){2}/SQSN
           |                                    |S TAA(TAAA){2}/SQSN
           |                                    |
{m,u}      |to repeat the preceding sequence    |S GG(FL){1,2}/SQSP
 or        | or sequence query (L#, E#, or      |S L3{1,3}/SQSP
{m-u}      | saved query) m to u times          |S NAME/Q{1,4}/SQSP
           |                                    |S (CTG){1,3}/SQSN
           |                                    |
?          |to repeat the preceding sequence    |S FLRRI(RP)?K/SQSP
 or        | or sequence query (L#, E#, or      |S FLRRI(RP){0,1}K/SQSP
{0,1}      | saved query) zero or one time      |S L1{0-1}NN/SQSP
 or        |                                    |S NAME/Q{0,1}NN/SQSP
{0-1}      |                                    |S CAT(CGA)
           |                                    | {0,1}GGAC/SQSN
           |                                    |
*          |to repeat the preceding sequence    |S KLK(WD){0,}N/SQSP
 or        | or sequence query (L#, E#, or      |S KLK(WD)*N/SQSP
{0,}       | saved query) zero or more times    |S L1{0-}NN/SQSP
 or        |                                    |S NAME/Q{0,}NN/SQSP
{0-}       |                                    |S CAT(CTG)
           |                                    | {0,}TATT/SQSN
           |                                    |
+          |to repeat the preceding sequence    |S KLK(DLE){1,}/SQSP
 or        | or sequence query (L#, E#, or      |S KLK(DLE)+/SQSP
{1,}       | saved query) one or more times     |S L2{1-}/SQSP
 or        |                                    |S NAME/Q{1,}/SQSP
{1-}       |                                    |S CAT(CTG){1,}
           |                                    | TATT/SQSN
           |                                    |
&          |to join together sequence           |S L1&L3/SQSFP
           | expressions or queries             |S L2&L5{1,3}/SQSP
           | (L#s, E#s, or saved queries)       |S NAME1/Q{2}
           |                                    | &NAME2/Q/SQSP
           |                                    |S E1&E3/SQSP

In addition, the caret and the vertical bar may be used.

The caret is used at the beginning or at the end of a sequence to search for that sequence at the beginning or end of the sequence field.

The vertical bar is the symbol for alternation, i.e., it is used to separate alternate sequence queries.

 

(1) For more information on specifying variability in subsequence
    queries, enter HELP SQQ at an arrow prompt in the Registry
    File.
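
Several of the variability symbols behave much like regular-expression operators, which makes it possible to try a query offline against a sequence already in hand. The sketch below is an approximation for experimentation only, not a description of how the file evaluates queries. It covers single-letter codes with [ ], [-], parentheses, ?, *, + and the {m}, {m,u} and {m-u} repeat forms; quoted three-letter codes, the & join, saved queries and the caret are deliberately rejected. The function names are hypothetical.

```python
import re


def variability_to_regex(query: str) -> str:
    """Translate a subset of the variability syntax into a Python regex."""
    if any(ch in query for ch in "'&^"):
        raise ValueError("quoted codes, '&' joins and the caret are not covered")
    pattern = query.replace("[-", "[^")                          # exclusion class
    pattern = re.sub(r"\{(\d+)-(\d*)\}", r"{\1,\2}", pattern)    # {m-u} -> {m,u}
    return pattern


def is_embedded(query: str, sequence: str) -> bool:
    """True if the translated query matches somewhere inside the sequence."""
    return re.search(variability_to_regex(query), sequence) is not None


print(variability_to_regex("LGP[-HL]"))
print(variability_to_regex("GG(FL){1-2}"))
print(is_embedded("KLK(WD)*N", "AKLKWDWDNA"))
```

The search-field suffix (for example /SQSP) is left off the query string here, since it selects the search type rather than describing the pattern.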

Specifying Gaps In Subsequence Searches (/SQSP, /SQSFP, and /SQSN)

  Symbol   |              Function              |    Search Examples
-----------|------------------------------------|----------------------
.          |a gap of one residue                |S SY.RPG/SQSP
           |                                    |S SY..RPG/SQSP
           |                                    |S AAG...TGC/SQSN
           |                                    |
.{m}       |a gap of m residues                 |S SY.{2}RPG/SQSP
 or        |                                    |S SY[2.]RPG/SQSP
[m.]       |                                    |
           |                                    |
.{m,u}     |a gap of m to u residues            |S GFF.{2,10}LSS/SQSP
 or        |                                    |S GFF.{2-10}LSS/SQSP
.{m-u}     |                                    |S AAG.{2,5}TGC/SQSN
           |                                    |
:          |a gap of zero or one residue        |S AGA:SRI/SQSFP
 or        |                                    |S AGA.?SRI/SQSFP
.?         |                                    |S AGA.{0,1}SRI/SQSFP
 or        |                                    |S AGA.{0-1}SRI/SQSFP
.{0,1}     |                                    |
 or        |                                    |
.{0-1}     |                                    |
           |                                    |
.*         |a gap of zero or more residues      |S HLC.*TYG/SQSP
 or        |                                    |S HLC.{0,}TYG/SQSP
.{0,}      |                                    |S HLC.{0-}TYG/SQSP
 or        |                                    |S AAGGCAGATG.*GCAA/SQSN
.{0-}      |                                    |
           |                                    |
.+         |a gap of one or more residues       |S SY.+TH/SQSFP
 or        |                                    |S SY.{1,}TH/SQSFP
.{1,}      |                                    |S SY.{1-}TH/SQSFP
 or        |                                    |S TCCTG.+GTGG/SQSN
.{1-}      |                                    |
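
The gap symbols can be explored the same way, by rewriting the file-specific spellings into their regular-expression equivalents. The sketch below is an approximation for trying out gap queries on a known sequence, not the file's matching algorithm: it rewrites ":" as ".?", "[m.]" as ".{m}", and the "{m-u}" range form as "{m,u}", and leaves everything else unchanged. The names gaps_to_regex and find_embedded are hypothetical, and quoted three-letter codes are not handled.

```python
import re


def gaps_to_regex(query: str) -> str:
    """Rewrite the gap notations into Python regex syntax."""
    if "'" in query:
        raise ValueError("quoted three-letter codes are not covered")
    pattern = query.replace(":", ".?")                           # zero or one residue
    pattern = re.sub(r"\[(\d+)\.\]", r".{\1}", pattern)          # [m.] -> .{m}
    pattern = re.sub(r"\{(\d+)-(\d*)\}", r"{\1,\2}", pattern)    # {m-u} -> {m,u}
    return pattern


def find_embedded(query: str, sequence: str):
    """Return the 1-based start of the first match, or None if there is none."""
    m = re.search(gaps_to_regex(query), sequence)
    return None if m is None else m.start() + 1


print(gaps_to_regex("SY[2.]RPG"))
print(gaps_to_regex("GFF.{2-10}LSS"))
print(find_embedded("SY.{2}RPG", "AASYKLRPGT"))
```

Because "SY.{2}RPG" requires exactly two residues between SY and RPG, the same test sequence would not match the one-residue form "SY.RPG".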

SELECT and SORT Fields

The SELECT command is used to create E-numbers or an L-number containing terms taken from the specified field in an answer set.

 The SORT command is used to rearrange the search results in either alphabetic or numeric order of the specified field(s).

 

Field Name                     Field Code   SELECT (1)       SORT

Alternate Molecular Formula       AF           Y (2)          N
Alternate Registry Number         AR           Y (3)          N
CA Index Name                     IN           Y (4)          Y
CAS Registry Number               RN           Y              Y
Chemical Name                     CN           Y (5)          N
Class Identifier                  CI           Y              N
Component Class Identifier        CCI          Y (6)          N
Component Molecular Formula       CMF          Y (7)          N
Component Registry Number         CRN          Y              N
Definition                        DEF          Y              N
Deleted Registry Number           DR           Y (3)          N
Elemental Analysis for            EA           Y              N
 Ring System
Elemental Sequence for            ES           Y              N
 Ring System
File Segment                      FS           Y              Y
Full Chemical Name                FCN          Y              N
Molecular Formula                 MF           Y              N
Names                             NAME         Y (8)          N
Nucleic Acid Sequence             SQEN         Y              N
 (exact search form)
Nucleic Acid Sequence             SQSN         Y              N
 (subsequence search form)
Polymer Class Term                PCT          Y              N
Preferred Registry Number         PR           Y (3)          N
Protein Sequence                  SQEFP        Y              N
 (exact family search form)
Protein Sequence                  SQEP         Y              N
 (exact search form)
Protein Sequence                  SQSFP        Y              N
 (subsequence family search form)
Protein Sequence                  SQSP         Y              N
 (subsequence search form)
Registry Number Locator           LC           Y              N
Registry Numbers and Names        CHEM         Y (9)          N
                                               (default)
Replacing Registry Number         RR           Y (5)          N
Ring Identifier                   RID          Y              N
Ring System Formula               RF           Y              N
Sequence (1-letter codes)         SEQ          Y              N
Sequence (3-letter codes)         SEQ3         Y              N
Sequence Length                   SQL          N              Y
Short Chemical Names              SCN          Y (4)          N
Size for the Ring System          SZ           Y              N
Source of Registration            SR           Y              N

(1) HIT may be used to restrict terms extracted to terms that match the
    search expression used to create the answer set, e.g., SEL HIT CN.
(2) /MF is appended.
(3) /RN is appended.
(4) /CN is appended.
(5) CA Index Name, first 50 names in alphabetical order, and any
    additional hit names are selected.
(6) /CI is appended.
(7) /BI is appended.
(8) All names except inverted names are selected and /BI is appended.
(9) AR, DR, PR, RN, RR, and all names except inverted names are selected
    and /BI is appended.

Sample Records

DISPLAY IDE (default)

RN   57-88-5  LREGISTRY
CN   Cholest-5-en-3-ol (3.beta.)- (9CI)  (CA INDEX NAME)
OTHER CA INDEX NAMES:
CN   Cholesterol (8CI)
OTHER NAMES:
CN   (-)-Cholesterol
CN   .DELTA.5-Cholesten-3.beta.-ol
CN   3.beta.-Hydroxycholest-5-ene
CN   5:6-Cholesten-3.beta.-ol
CN   Cholest-5-en-3.beta.-ol
CN   Cholesterin
CN   Cholesteryl alcohol
CN   Dythol
CN   Lidinit
CN   Lidinite
CN   Provitamin D
FS   STEREOSEARCH
MF   C27 H46 O
CI   COM
LC   STN Files:   ANABSTR, BEILSTEIN*, BIOBUSINESS, BIOSIS, CA, CAOLD,
       CAPREVIEWS, CASREACT, CEN, CHEMINFORMRX, CHEMLIST, CBNB, CIN,
       CJACS, CSCHEM, CSNB, DDR, DETHERM*, DRUGR, DRUGU, EMBASE, GMELIN*,
       HODOC*, IFICDB, IFIPAT, IFIUDB, IPA, MEDLINE, MRCK*, MSDS-OHS,
       MSDS-SUM, NAPRALERT, PDLCOM*, PIRA, PNI, PROMT, RTECS*, SPECINFO,
       TOXLINE, TOXLIT, USAN, VTB
         (*File contains numerically searchable property data)
     Other Sources:   DSL**, EINECS**, TSCA**
         (**Enter CHEMLIST File for up-to-date regulatory information)
DES  4:3B.CHOLEST

[Structure diagram not reproduced]

STEREO DIAGRAM AVAILABLE WITH GRAPHICS TERMINAL OR OFFLINE PRINT

DISPLAY IDE (default)

RN   91386-77-5  LREGISTRY
CN   Interferon .alpha.1 (human leukocyte protein moiety reduced),
     1-L-serine- (9CI)  (CA INDEX NAME)
FS   PROTEIN SEQUENCE
MF   Unspecified
CI   MAN
LC   STN Files:   CA

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***

DISPLAY SQIDE (Protein Sequence Record)

RN   91386-77-5  LREGISTRY
CN   Interferon .alpha.1 (human leukocyte protein moiety reduced),
     1-L-serine- (9CI)  (CA INDEX NAME)
FS   PROTEIN SEQUENCE
SQL  166
SEQ      1 SDLPETHSLD NRRTLMLLAQ MSRISPSSCL MDRHDFGFPQ EEFDGNQFQK
        51 APAISVLHEL IQQIFNLFTT KDSSAAWDED LLDKFCTELY QQLNDLEACV
       101 MQEERVGETP LMNADSILAV KKYFRRITLY LTEKKYSPCA WEVVRAEIMR
       151 SLSLSTNLQE RLRRKE
MF   Unspecified
CI   MAN
LC   STN Files:   CA

DISPLAY SQIDE (Nucleic Acid Sequence)

RN   91449-61-5  LREGISTRY
CN   Deoxyribonucleic acid (Tikaut provirus 5'-long terminal repeat)
     (9CI)  (CA INDEX NAME)
FS   NUCLEIC ACID SEQUENCE
SQL  641
NA   186 a   170 c   160 g   125 t
NTE  doublestranded
SEQ      1 tgaaagaccc caccataagg cttagcaagc tagctgcagt aacgccattt
        51 tgcaaggcat gaaaaagtac cagagctgag ttctcaaagt caacaacgaa
       101 gtttagttaa agaataaggc tgaacaaaac tgggacaggg gccaaacagg
       151 atatctgtgg tcgagcagct agggccccgg ctcagggcca agaacagatg
       201 gtactcagat aaagcgaagg gctgaacaaa acgggacagg ggccaaacag
       251 gatgggggcc aaacaggata tctgtggtcg agcacctggg ccccggctca
       301 gggccaagaa cagatggtac tcagataaag cgaaactaac aacagtttct
       351 ggaaagtccc acctcagttt caagttcccc aaaagaccgg gaaaaacccc
       401 aagccttatt taaactaacc aatcagctcg cttctcgctt ctgtaacccg
       451 cgctttttgc tcccagccct ataaaaaggg taaaaacccc acactcggcg
       501 ccccagtcct ccgatagact gagtcgcccg ggtacccgtg tatccaataa
       551 agccttttgc tgttgcatcc gaatcgtggt ctcgctgatc cttgggaggg
       601 tctcctcaga gtgattgact gcccagcctg ggggtctttc a
MF   Unspecified
CI   MAN
LC   STN Files:   CA, TOXLIT

DISPLAY FIDE

RN   53784-90-0  LREGISTRY
CN   .gamma.-Cyclodextrin, 6A,6B,6C,6D,6E,6F,6G,6H-octadeoxy- (9CI)
     (CA INDEX NAME)
OTHER CA INDEX NAMES:
CN   2,4,7,9,12,14,17,19,22,24,27,29,32,34,37,39-
     Hexadecaoxanonacyclo[36.2.2.23,6.28,11.213,16.218,21.223,26.228,31.2
     33,36]hexapentacontane, .gamma.-cyclodextrin deriv. (9CI)
MF   C48 H80 O32
LC   STN Files:   BEILSTEIN*, CA
         (*File contains numerically searchable property data)
DES  6:GAMMA-CYCLODEXTRIN

Ring System Data

 Elemental  |  Elemental  |  Size of   |Ring System|   Ring   |   RID
  Analysis  |  Sequence   | the Rings  |  Formula  |Identifier|Occurrence
     EA     |     ES      |     SZ     |    RF     |   RID    |  Count
============+=============+============+===========+==========+==========
C5O-C5O-C5O-|OC5-OC5-OC5- |6-6-6-6-6-6-|C40O16     |14246.1.1 |1
C5O-C5O-C5O-|OC5-OC5-OC5- |6-6-40      |           |          |
C5O-C5O-    |OC5-OC5-     |            |           |          |
C24O16      |OCOC2OCOC2OCO|            |           |          |
            |C2OCOC2OCOC2O|            |           |          |
            |COC2OCOC2OCOC|            |           |          |
            |2            |            |           |          |

[Structure diagram not reproduced; in the original display it spans
Page 1-A and Page 1-B]

DISPLAY CCN

CN   Methanaminium, N-[4-[[4-(dimethylamino)phenyl]phenylmethylene]-2,5-
     cyclohexadien-1-ylidene]-N-methyl-, chloride (9CI)  (CA INDEX NAME)
OTHER CA INDEX NAMES:
CN   C.I. Basic Green 4 (8CI); Victoria Green WB (6CI)
OTHER NAMES:
CN   Acryl Brilliant Green B; ADC Malachite Green Crystals; Aizen
     Malachite Green; Aizen Malachite Green Crystals; Aniline green;
     Astra Malachite Green; Astra Malachite Green B; Astra Malachite
     Green BXX; Atlantic Malachite Green; Basic Green 4; Basonyl Green
     830; Benzal Green; Benzaldehyde green; Bronze Green Toner A 8002;
     Burma Green B; C.I. 42000; Calcozine Green V; China Green; Diabasic
     Malachite Green; Diamond Green B extra; Diamond Green BX; Diamond
     Green P Extra; Green MX; Grenoble Green; Hidaco Malachite Green
     Base; Hidaco Malachite Green LC; Hidaco Malachite Green SC; Light
     Green N; Lincoln Green Toner B 15-2900; Malachite green; Malachite
     Green A; Malachite Green AN; Malachite Green B; Malachite green
     chloride; Malachite Green CP; Malachite Green Crystals; Malachite
     Green Crystals BPC; Malachite Green J 3E; Malachite Green Powder;
     Malachite Green WS; Malachite Lake Green A; Mitsui Malachite Green;
     New Victoria Green Extra I; New Victoria Green Extra II; New
     Victoria Green Extra O; Oji Malachite Green; Solid Green Crystals O;
     Solid Green O; Super Ick Cure; Tertrophene Green M; Tokyo Aniline
     Malachite Green; Verona Basic Green M; Victoria Green; Victoria
     Green (basic dye); Victoria Green B; Victoria Green S; Victoria
     Green WPB
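
In the SQIDE sample records, each SEQ line starts with the position of its first residue and then lists residues in groups of ten. The header fields can be checked against the sequence itself: in the nucleic acid record, the NA counts (186 + 170 + 160 + 125) add up to the SQL value of 641. The Python sketch below shows one way to turn a copied SEQ block back into a plain string and tally its composition. The function names are hypothetical, and the parser assumes the layout shown in the sample records (an optional SEQ label, then a position number, then space-separated residue groups).

```python
from collections import Counter


def parse_seq_block(block):
    """Join the lines of a displayed SEQ block into one residue string.

    Assumes each line may begin with the field label "SEQ" and/or a
    position number, followed by space-separated groups of residues.
    """
    groups = []
    for line in block.splitlines():
        tokens = line.split()
        if tokens and tokens[0] == "SEQ":
            tokens = tokens[1:]          # drop the field label
        if tokens and tokens[0].isdigit():
            tokens = tokens[1:]          # drop the position number
        groups.extend(tokens)
    return "".join(groups)


def composition(sequence):
    """Count how often each residue code occurs (case-insensitive)."""
    return Counter(sequence.lower())


if __name__ == "__main__":
    # First two SEQ lines of the nucleic acid sample record (RN 91449-61-5).
    excerpt = """\
SEQ      1 tgaaagaccc caccataagg cttagcaagc tagctgcagt aacgccattt
        51 tgcaaggcat gaaaaagtac cagagctgag ttctcaaagt caacaacgaa
"""
    seq = parse_seq_block(excerpt)
    print(len(seq))
    print(composition(seq))
```

Applied to a complete SEQ block, the length of the returned string can be compared with the SQL field, and the tallies with the NA field, as a quick check that the record was copied without loss.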
