Database Name: REGISTRY - The CAS Registry File of Substances

Description: The REGISTRY File is a chemical structure and dictionary database containing unique substance records that are produced as new substances are identified by the Chemical Abstracts Service (CAS) Registry System. The REGISTRY File contains records for all the substances cited in the CAS Registry System. These include substances cited in CAplus, CA, and CAOLD files, and special registrations, for example, registrations for regulatory lists such as TSCA and EINECS.

All substance records contain a unique CAS REGISTRY Number (R) and index name. Substance records may also have synonyms, molecular formulas, alloy composition tables, classes for polymers, nucleic acid and protein sequences, ring analysis data, and structure diagrams, all of which are searchable and displayable. Nucleic acid sequences from GenBank (R) (registered trademark of the U.S. Department of Health and Human Services) are also included.

Also displayable in the REGISTRY File are the 10 most recent CA references citing the substance since 1967, the total number of citations to the substance in the CAplus, CA, and CAOLD files, and the total number of citations in the CA File for the non-specific derivatives.

LREGISTRY is the companion training file for learning how to use the REGISTRY File.

Subject Coverage: The Registry File contains all types of chemical substances described in the literature. All types of inorganic and organic substances are covered including: alloys, biosequences, coordination compounds, minerals, mixtures, polymers, and salts.
Chemistry; Patents; Pharmaceutical Chemicals; Polymers; Structures (Chemical)

Sources: The CAS Registry System, which identifies new substances described in such sources as journal articles, patents, and conference proceedings and substances on regulatory lists; GenBank

File Data: 1957-present; more than 15.8 million records (10/96); updated weekly; automatic current-awareness searches (SDIs) are run every two weeks

Language: English

User Aids:
  Adding Screens in Structure Searching
  Building and Searching Structures on STN Manual
  CA Index Guides
  Finding and Verifying CAS Registry Numbers on STN
  How to Search for CAS Registry Numbers in the CAS Registry File Quick Reference Card
  Naming & Indexing of Chemical Substances for CA Appendix IV
  Nucleic Acids on STN - A Quick Reference Guide
  Polymer Class Terms
  Polymer Information on STN - A Quick Reference Guide
  Protein Sequences on STN - A Quick Reference Guide
  REGISTRY File: Basic Name Segment Dictionary
  REGISTRY File: Biosequence Searching Manual
  REGISTRY File: Dictionary Searching Manual
  Screen Dictionary
  Searching CASLINK Quick Reference Card
  Searching Coordination Compounds
  Searching for Polymer Information
  STNote 4: Send and view graphic images in STNmail
  STNote 6: CAS Registry Enhancements
  STNote 7: Current awareness on STN
  Structuring Functional Groups
  Using Stereosearch Quick Reference Card
  Using the CAS Registry File on STN Student Manual
  Using the CAS Registry File on STN Structure Searching Student Manual
  Online Helps (HELP DIRECTORY lists all help messages available)
  STNGUIDE

Database Producer:
  Chemical Abstracts Service
  2540 Olentangy River Road
  P. O. Box 3012
  Columbus, OH 43210-0012 USA

Database Representatives:

Representative in the U.K.
and Ireland:
  The Royal Society of Chemistry (RSC)
  Cambridge, United Kingdom
  Phone: (+44) (1223) 432110
  Fax: (+44) (1223) 423623

Representative in the Federal Republic of Germany, Austria, and Switzerland:
  Fachinformationszentrum Chemie GmbH
  Berlin, Federal Republic of Germany
  Phone: (+49)(030) 39076-201
  Fax: (+49) (030) 39076-333

Representative in Japan:
  The Japan Association for International Chemical Information
  Tokyo, Japan
  Phone: (+81)(033) 5978-3601
  Fax: (+81)(033) 5978-3600

Representative in France:
  Compagnie d'Applications et d'Assistance en Documentation (CAPADOC)
  Boulogne, France
  Phone: (+33)(01)4603-1085
  Fax: (+33)(01)4603-9890

Representative in Australia:
  Damon Ridley
  School of Chemistry, F11
  University of Sydney
  NSW 2006 Sydney, Australia
  Phone: (+61)(02) 351 2180
  Fax: (+61) (02) 351 6650
  e-mail: dridley@chem.usyd.edu.au

Representative in Finland:
  VTT Information Service
  Espoo, Finland
  Phone: (+358) (90) 4564386
  Fax: (+358) (90) 4564374

Representative in Sweden:
  Information and Documentation Center, Royal Institute of Technology Library (IDC-KTHB)
  Stockholm, Sweden
  Phone: (+46)(08) 790 89 50
  Fax: (+46)(08) 790 89 54

Representative in Belgium:
  Royal Library NCWDT-CNDST
  Keizerslaan 4 Bld de l'Empereur
  Brussels, Belgium
  Phone: (+32)(02) 519.56.44
  Fax: (+32)(02) 519.56.79

Representative in the Netherlands:
  COBIDOC B.V.
  Amsterdam, The Netherlands
  Phone: (+31)(020)622-3955
  Fax: (+31)(020)622-2556

Representative in Argentina:
  (Asociacion Quimica Argentina)
  Sanchez De Bustamante 1749
  1425 Buenos Aires, Argentina
  Phone: (+54) 1 824-7986
  Fax: (+54) 1 822-4886

Representative in Israel:
  (Arad-Ophir Information Specialists)
  30 Binyamin-Midodelo St.
  Tel-Aviv, Israel 69546
  Phone: (+972) 3 64 83 48 8
  Fax: (+972) 3 64 71 78 0

Representative in Italy:
  (Frances H. Barker)
  Newlands, Loughborough Road
  Ruddington
  Notts, U.K. NG11 6LU
  Phone: (+44) 1159/211-148
  Fax: (+44) 1159/211-148

Representative in Korea:
  (Korea Institute of Industry and Technology Information (KINITI))
  Seoul 130-742 Korea
  Phone: (82) (0) 962-6211/8
  Fax: (82) (0) 962-7199

Representative in Spain:
  (CEDETECH)
  Universitat de Barcelona
  Facultats de Fisica i Quimica
  Diagonal, 647
  08028 Barcelona, Spain
  Phone: (+34) 3 411 15 77
  Fax: (+34) 3 411 26 11

In all other countries:
  Chemical Abstracts Service
  Columbus, OH, U.S.A.
  Phone: 614-447-3600
  Fax: 614-447-3713

Search and Display Fields:

Fields that allow left truncation are marked with an asterisk (*).

                             | Search |      Search      |   Display
     Search Field Name       |  Code  |     Example      |    Codes
=============================+========+==================+==============
Basic Index (contains name   |None (or|S TOSYL           |AF, CN, IN,
 fragments, molecular formula| /BI)   |S DIMETHYL        |
 fragments, and Collective   |        |  ADIPATE         |
 Index codes)(1)             |        |S 6CI             |
                             |        |S 1,1(W)DICHLORO  |
                             |        |S C5H10BR2O2      |
-----------------------------+--------+------------------+--------------
CAS Registry Number          |/RN     |S 97-77-8/RN      |RN, AR, DR, PR
                             |        |S 97-77-8         |
Class Identifier (codes or   |/CI     |S MXS/CI          |CI
 terms as a bound phrase)    |        |S ALLOY/CI        |
Component Registry Number    |/CRN    |S 79-10-7/CRN     |CRN
Definition                   |/DEF    |S HYDROCARBONS/DEF|DEF
Entry Date (2)               |/ED     |S 920105/ED       |Not displayed
Field Availability (codes or |/FA     |S RSD/FA AND L5   |Not displayed
 terms as a bound phrase)    |        |S MATERIAL        |
                             |        |  COMPOSITION/FA  |
File Segment (acronyms or    |/FS     |S 3D/FS           |FS
 single words)               |        |S PROTEIN/FS      |
                             |        |S PS/FS           |
                             |        |S NUCLEIC/FS      |
Number of References in the  |/REF.CA |S L1 AND          |REF
 CA File (2)                 |        |  REF.CA<=10      |
Number of References in the  |/REF.CAD|S L3 AND 1/REF.CAD|REF
 CA File for Non-Specific    |        |                  |
 Derivatives (2)             |        |                  |
Number of References in the  |/REF.   |S L2 AND          |REF
 CAOLD File (2)              | CAOLD  |  1-5/REF.CAOLD   |
Number of References in the  |/REF.   |S L2 NOT          |REF
 CAplus File (2)             | CAPLUS |  REF.CAPLUS>10   |
Number of References in the  |/REF.CAP|S REF.CAP>=1      |REF
 CAPREVIEWS File (2)         |        |                  |
Polymer Class Term (code or  |/PCT    |S POLYAMINE/PCT   |PCT
 text)                       |        |S PM/PCT          |
Registry Number Locator      |/LC     |S TSCA/LC         |LC
                             |        |S GENBANK/LC      |
                             |        |S L1 AND CA/LC    |
Update Date (2)              |/UP     |S UP>=920209      |Not displayed

(1) Formula fragments searched in the Basic Index must be entered without spaces.
(2) Numeric search field that may be searched using numeric operators or ranges.

Nomenclature Fields

                             | Search |      Search      |   Display
     Search Field Name       |  Code  |     Example      |    Codes
=============================+========+==================+==============
Chemical Name                |/CN     |S 1-CHLORO-1,3-   |CN, IN
                             |        |  BUTADIENE/CN    |
                             |        |S INTERFERON      |
                             |        |  .ALPHA.1?/CN    |
                             |        |S GENBANK         |
                             |        |  M12334/CN       |
Chemical Name Segment * (1)  |/CNS    |S IMINO/CNS       |CN, IN
                             |        |S ?QUAT?/CNS NOT  |
                             |        |  AQUA            |
Heading Parent               |/HP     |S BENZOIC ACID    |CN, IN
                             |        |  /HP             |
Index Name Segment Heading   |/INS.HP |S METHYLETHYL     |CN, IN
 Parent                      |        |  /INS.HP         |
Index Name Segment NonHeading|/INS.NHP|S ACRYLO/INS.NHP  |CN, IN
 Parent                      |        |                  |
Other Name Segment           |/ONS    |S ANILINE/ONS     |CN

(1) With left truncation, the input term must contain at least 4 characters.
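The /RN examples in the Search and Display Fields table use CAS Registry Numbers such as 97-77-8. The last digit of a Registry Number is a check digit, so a number can be sanity-checked offline before it is used in a search. The Python sketch below is an illustration only and is not part of STN; the function name `is_valid_cas_rn` is made up for this example.

```python
import re


def is_valid_cas_rn(rn: str) -> bool:
    """Return True if rn has the CAS Registry Number layout and a correct check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn.strip())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check_digit = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one,
    # then compare the weighted sum modulo 10 with the check digit.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1)
    )
    return total % 10 == check_digit


if __name__ == "__main__":
    # Registry Numbers taken from the search examples, plus one altered value.
    for rn in ("97-77-8", "79-10-7", "97-77-9"):
        print(rn, is_valid_cas_rn(rn))
```

A number that fails this check was most likely mistyped; a number that passes is well formed but is not guaranteed to exist in the file.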
Molecular Formula Fields | Search | Search | Display Search Field Name | Code | Example | Codes =============================+========+==================+============== Atom Count (1) |/ATC |S 5/ATC |Not displayed Element Count (1) |/ELC |S 7-9/ELC |Not displayed Element Count for Substance |/ELC.SUB|S ELC.SUB>=8 |Not displayed (1) | | | Element Formula (2) |/ELF |S AL CO LA O/ELF |AF, MF Element Ratio, xx (1) (where |/ELR.xx |S 3.1666667/ELR.CH|Not displayed xx = CH, CN, CO, HC, HN, HO,| |S 1-2/ELR.CN | NC, NH, NO, OC, OH, or ON) | |S ELR.CO<=1 | Element Symbol |/ELS |S B/ELS AND H/ELS |Not displayed Element Symbol for |/ELS.MCF|S (N (XA) P) |Not displayed Multicomponent Formula | | /ELS.MCF | Formula Weight (1) |/FW |S 420-460/FW |Not displayed Material Composition (3) |/MAC |S 1-5 ND/MAC |STR Molecular Formula (4) |/MF |S C7H3BR2FO2/MF |AF, MF | |S C4H4O4.2NA/MF | | |S C24 H37 OS P3/MF| Number of Components (1) |/NC |S F/ELS NOT NC>=2 |Not displayed Periodic Group |/PG |S B6/PG |Not displayed | |S LNTH/PG | Relative Composition |/RC |S FE.CR.NI/RC |Not displayed Specific Element Count (1) |/Element|S 7/SI |Not displayed |Symbol | | (1) Numeric search field that may be searched using numeric operators or ranges. (2) Formulas must be entered with spaces between the elements. (3) Combined numeric and text field. Composition terms are numeric and may be searched using numeric operators or ranges. Component terms are text terms. (4) Formulas may be entered with or without spaces. 
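The footnotes above distinguish formula terms that must be entered without spaces (Basic Index), with spaces between the elements (/ELF), or either way (/MF). The Python sketch below shows one way to prepare such terms from a single formula. It is an illustration under stated assumptions: the helper names are hypothetical, the input must use normal mixed-case element symbols (for example `Br`, not `BR`), only single-component formulas are handled, and the /ELF term lists element symbols only, following the `S AL CO LA O/ELF` example.

```python
import re

_TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula(formula: str) -> list[tuple[str, int]]:
    """Split a single-component formula such as 'C5H10Br2O2' into (symbol, count) pairs."""
    compact = "".join(formula.split())  # tolerate 'C27 H46 O' style spacing
    if not re.fullmatch(r"(?:[A-Z][a-z]?\d*)+", compact):
        raise ValueError(f"not a simple formula: {formula!r}")
    return [(symbol, int(count) if count else 1) for symbol, count in _TOKEN.findall(compact)]


def _fragments(formula: str) -> list[str]:
    # Upper-case fragments such as 'C5', 'H10', 'BR2', 'O2'; a count of 1 is omitted.
    return [f"{symbol}{count if count > 1 else ''}".upper() for symbol, count in parse_formula(formula)]


def basic_index_term(formula: str) -> str:
    """Basic Index form: fragments joined without spaces."""
    return "S " + "".join(_fragments(formula))


def mf_term(formula: str) -> str:
    """/MF form: written here with spaces, which the footnote says is also accepted."""
    return "S " + " ".join(_fragments(formula)) + "/MF"


def elf_term(formula: str) -> str:
    """/ELF form (assumption): element symbols only, separated by spaces."""
    return "S " + " ".join(symbol.upper() for symbol, _ in parse_formula(formula)) + "/ELF"


if __name__ == "__main__":
    print(basic_index_term("C5H10Br2O2"))
    print(mf_term("C7H3Br2FO2"))
    print(elf_term("AlCoLaO3"))
```

Mixed-case input is required because an all-capitals string such as `CO` cannot be split unambiguously into elements.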
Ring Analysis Data Fields | Search | Search | Display Search Field Name | Code | Example | Codes =============================+========+==================+============== Elemental Analysis for Ring |/EA |S C4N-C5N/EA |RSD System (1) (and number of | |S 2 C3NO-C6/EA | occurrences of EA in a | | | component structure) | | | Elemental Analysis for |/EAS |S C5NO4/EAS |Not displayed Smallest Ring (1) (and | |S >9 C6/EAS | number of occurrences of EAS| | | in a ring system) | | | Elemental Sequence for Ring |/ES |S NCOC2-C6/ES |RSD, SRSD System (1) (and number of | |S 1-3 O2C4/ES | occurrences of ES in a | | | component structure) | | | Elemental Sequence for |/ESS |S FE3/ESS |Not displayed Smallest Ring (1) (and | |S >=2 SC2SC2/ESS | number of occurrences of ESS| | | in a ring system) | | | Number of Ring Systems (2) |/NRS |S 7/NRS |Not displayed Number of Ring Systems in a |/CNRS |S 4-5/CNRS |Not displayed Component (2) | | | Number of Rings (number of |/NR |S 10/NR |Not displayed smallest rings) (2) | | | Number of Rings in a |/CNR |S CNR>=12 |Not displayed Component (number of | | | smallest rings)(2) | | | Number of Rings in Ring |/NRRS |S 5-6/NRRS |Not displayed System (2) | | | Ring Atom Count (2) |/RATC |S 4/RATC |Not displayed Ring Element (1) (and number |/REL |S SE/REL |Not displayed of occurrences of REL in a | |S 5 P/REL | ring system) | | | Ring Element Count (2) |/RELC |S 6/RELC |Not displayed Ring Elemental Formula (1,3) |/RELF |S C N O P/RELF |Not displayed (and number of occurrences | |S >3 C N O/RELF | of RELF in a component | | | structure) | | | Ring Identifier (1) (and |/RID |S 31779.1.2/RID |RSD, SRSD number of occurrences of RID| |S 1938/RID | in a component structure) | |S >=2 1949.52/RID | Ring Size of Smallest Ring |/SZS |S 8/SZS |Not displayed (1,2) (and number of | |S 5 4/SZS | occurrences of SZS in a ring| | | system) | | | Ring System Formula (1) (and |/RF |S C20AGN4/RF |RSD number of occurrences of RF | |S 5 C10/RF | in a component 
structure) | | | Size for the Ring System (1) |/SZ |S 3-4-5/SZ |RSD (and number of occurrences | |S 3 5-5-6/SZ | of SZ in a component | | | structure) | | | (1) The number of occurrences must be entered first in the search field. It is a numeric term and may be searched using numeric operators or ranges. (2) Numeric search field that may be searched using numeric operators or ranges. (3) Formulas must be entered with spaces between the elements. Sequence Fields | Search | Search | Display Search Field Name | Code | Example | Codes =============================+========+==================+============== Notes * (1) |/NTE |S CYCLIC/NTE |NTE | |S ?CHLORO?/NTE | | |S OAA-17/NTE | Nucleic Acid Count (2,3) |/NA.CNT |S 12-42/NA.CNT |NA Nucleic Acid Type (3) |/NA |S 12-42 A/NA |NA | |S G/NA | Sequence Length (2) |/SQL |S 4-20/SQL |SQL | |S SQL<=500 | (1) With left truncation, the input term must contain at least 4 characters. (2) Numeric search field that may be searched using numeric operators or ranges. (3) Field contains data only for nucleic acid sequences. Limiting Search Codes: Only an L-number for an answer set created in REGISTRY may be limited. | Search | Search Search Field Name | Code (1) | Example =============================+===========+=============== Answers completely iterated |/COMPLETE |S L4/COM Answers incompletely iterated|/INCOMPLETE|S L4/INC (1) The code may be abbreviated to the first three letters. 
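The /SQL and /NA.CNT fields in the Sequence Fields table are numeric, so it can help to know a sequence's length and base counts before writing a range search. The Python sketch below tallies them from a displayed SEQ field. It assumes that SQL is the number of residues and that NA lists a count per base, which is consistent with the GenBank M12334 sample record later in this sheet (SQL 35; NA 6 a 8 c 14 g 7 t). The function name is hypothetical.

```python
from collections import Counter


def na_summary(seq_field: str) -> tuple[int, str]:
    """Return (sequence length, NA-style base counts) for a displayed SEQ field.

    Position numbers and blanks in the display are ignored; only letters count.
    """
    bases = [ch for ch in seq_field.lower() if ch.isalpha()]
    counts = Counter(bases)
    na_text = " ".join(f"{counts[base]} {base}" for base in sorted(counts))
    return len(bases), na_text


if __name__ == "__main__":
    # SEQ field of the GenBank M12334 sample record, as displayed.
    seq = "1 ggaggctcat ttgcagttga ggccagcagg tcggc"
    length, na_text = na_summary(seq)
    print("SQL", length)
    print("NA ", na_text)
```

With these values, a range such as `S 30-40/SQL` would include this record's length.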
Structure Search Terms: Structure Search Terms (1,2) |Search Examples ==========================================+========================= L-numbers of structures built using the |SEARCH L1 FAM SAM STRUCTURE command or uploaded from STN |SEA L1 AND L2 SSS FUL Express (Boolean logic allowed between | the L-numbers) | L-numbers of screen sets created using the|S L3 OR L4 SSS SAM SCREEN command (Boolean logic allowed | between the L-numbers) | L-numbers of structures built using the |S L1 AND L2 NOT L3 STRUCTURE command or uploaded from STN | Express combined with L-numbers of screen| sets created using the SCREEN command | (Boolean logic allowed between L-numbers)| (1) The L-number answer set from a structure search may be combined with dictionary terms, e.g., S L3 AND TSCA/LC. (2) For Sequence Search Terms see the FEATURES section. Structure Search Types: Structure | |Search| Search Search Type | Definition | Code | Examples ============ +==========================+======+================== Substructure |Search for substances that|SSS |SEARCH L1 SSS FUL (default) | match the query. | |S L2 OR L3 SSS SAM | Substitution is allowed | |S L7 SSS RAN | at all open positions. | | | Additional components may| | | be retrieved. | | Closed |Search for substances that|CSS |SEARCH L1 CSS FUL Substructure| match the query exactly. | |S L2 NOT L3 CSS | Substitution is allowed | |S L4 OR L5 CSS RAN | at positions opened by | | | CONNECT. Additional | | | components may be | | | retrieved. | | Family |Search for substances that|FAM |S L6 FAM SAM | match the query exactly. | | | Additional components may| | | be retrieved. 
| | Exact |Search for substances that|EXA |SEA L5 EXA FUL | match the query exactly | | Structure Search Scopes: Structure | |Search | Search Search Scope | Definition | Code | Examples =============+==================+=======+=================== Sample |Search a fixed 5% |SAM |SEARCH L3 EXA SAM (default) | of the file | |S L6 NOT L7 SSS SAM Full |Search 100% of the|FUL |S L5 OR L8 SSS FUL | file | | Range |Search a user |RAN |S L4 RAN=(110507-58-9,) | specified portion| |S L3 FAM RAN=(109784-14-7, | of the file | | 109904-92-9) Subset Sample|Search a fixed |SUB SAM|S L7 CSS SUB=L5 SAM | sample of an | | | answer set | | | created by a | | | search in | | | REGISTRY | | Subset Range |Search a user |SUB RAN|S L3 SUB=L2 | specified portion| | RAN=(,50-11-3) | of an answer set | | | created by a | | | search in | | | REGISTRY | | Subset Full |Search 100% of an |SUB FUL|S L8 SUB=L6 FAM FUL | answer set | | | created by a | | | search in | | | REGISTRY | | Sequence Search Terms: Terms | Search Examples ===================================+================================ Single letter codes for common |S LAGLL/SQSP amino acids (1) | Three-letter codes for common |S 'LEU-ALA-GLY-LEU-LEU'/SQSFP and uncommon amino acids (1) (2) |S F'HCY-STA'LF/SQSP Enclose codes or strings of codes |S 'GLP'AGYSK/SQEP in single quotes. Use dashes to |S 'CYS-ASN-THR-ALA'/SQEP separate codes in strings. | Single letter codes for nucleic |S ATTTTTTTTTT/SQEN acids (3) |S AAGGTTACTA/SQSN (1) Enter HELP AAC at an arrow prompt to display a table of the 1- and 3-letter codes for common amino acids. (2) Enter HELP AAU at an arrow prompt to display a table of the 3-letter codes for uncommon amino acids. (3) Enter HELP NUC at an arrow prompt to display a table of the codes for nucleic acids. Sequence Search Types: Sequence data for nucleic acid and protein sequences are displayed in the SEQ field with 1-letter codes and the SEQ3 field with 3-letter codes for proteins only. 
   Type    |      Definition       | Code  |         Examples
===========+=======================+=======+==========================
Sequence   |Search for sequences   |/SQEP  |S YADAIF/SQEP
 Exact,    | that match the query. |       |S 'CYS-ASN-THR-ALA'/SQEP
 Protein   | The query must be     |       |
           | completely defined.   |       |
Sequence   |Search for sequences   |/SQEFP |S YGGFL/SQEFP
 Exact     | that match the query  |       |S 'TYR-GLY-GLY-PHE-LEU'/SQEFP
 Family,   | and those in which    |       |
 Protein   | family-equivalent     |       |
           | substitution of the   |       |
           | query amino acids     |       |
           | occurs (1)            |       |
Subsequence|Search for exact       |/SQSP  |S LAGLL/SQSP
 Protein   | answers plus          |       |S F'HCY-STA'LF/SQSP
           | sequences in which    |       |
           | the query sequence    |       |
           | is embedded.          |       |
           | Variability symbols   |       |
           | are allowed.          |       |
Subsequence|Search for exact       |/SQSFP |S ATCXAWV/SQSFP
 Family,   | sequences,            |       |S 'LEU-ALA-GLY-LEU-LEU'/SQSFP
 Protein   | subsequences, and     |       |
           | answers in which      |       |
           | family-equivalent     |       |
           | substitution of the   |       |
           | query amino acids     |       |
           | occurs (1)            |       |
Sequence   |Search for sequences   |/SQEN  |S ATTTTTTTTTT/SQEN
 Exact,    | that match the query. |       |
 Nucleic   | Ambiguity codes for   |       |
 Acid      | nucleic acids are     |       |
           | allowed.              |       |
Subsequence|Search for exact       |/SQSN  |S AAGGTTACTA/SQSN
 Nucleic   | answers, plus         |       |
 Acid      | sequences in which    |       |
           | the query sequence is |       |
           | embedded. Ambiguity   |       |
           | codes for nucleic     |       |
           | acids and variability |       |
           | symbols are allowed.  |       |

(1) The families of amino acid equivalents retrieved in protein family searches are:
      P, A, G, S, T     (neutral, weakly hydrophobic)
      Q, N, E, D, B, Z  (hydrophilic, acid amine)
      H, K, R           (hydrophilic, basic)
      L, I, V, M        (hydrophobic)
      F, Y, W           (hydrophobic, aromatic)
      C                 (cross-link forming)

Variability Symbols for Subsequence Searches (/SQSP, /SQSFP, and /SQSN) (1):

  Symbol   |             Function             | Search Examples
===========+==================================+================
[ ]        |To specify alternate residues     |S LGP[VL]/SQSP
           |                                  |S LGP['VAL''LEU']/SQSP
-----------+----------------------------------+----------------
[-]        |To exclude a specific residue     |S LGP[-H]/SQSP
           | or alternate residues            |S LGP[-'HIS']/SQSP
           |                                  |S LGP[-HL]/SQSP
-----------+----------------------------------+----------------
{m}        |To repeat the preceding sequence  |S (FL){2}/SQSP
           | or sequence query (L#, E#, or    |S L4{2}/SQSP
           | saved query) m times             |S NAME/Q{3}/SQSP
           |                                  |S (CTG){2}/SQSN
           |                                  |S TAA(TAAA){2}/SQSN
-----------+----------------------------------+----------------
{m,u}      |To repeat the preceding sequence  |S GG(FL){1,2}/SQSP
 or        | or sequence query (L#, E#, or    |S L3{1,3}/SQSP
{m-u}      | saved query) m to u times        |S NAME/Q{1,4}/SQSP
           |                                  |S (CTG){1,3}/SQSN
-----------+----------------------------------+----------------
?          |To repeat the preceding sequence  |S FLRRI(RP)?K/SQSP
 or        | or sequence query (L#, E#, or    |S FLRRI(RP){0,1}K/SQSP
{0,1}      | saved query) zero or one time    |S L1{0-1}NN/SQSP
 or        |                                  |S NAME/Q{0,1}NN/SQSP
{0-1}      |                                  |S CAT(CGA){0,1}GGAC/SQSN
-----------+----------------------------------+----------------
*          |To repeat the preceding sequence  |S KLK(WD){0,}N/SQSP
 or        | or sequence query (L#, E#, or    |S KLK(WD)*N/SQSP
{0,}       | saved query) zero or more times  |S L1{0-}NN/SQSP
 or        |                                  |S NAME/Q{0,}NN/SQSP
{0-}       |                                  |S CAT(CTG){0,}TATT/SQSN
-----------+----------------------------------+----------------
+          |To repeat the preceding sequence  |S KLK(DLE){1,}/SQSP
 or        | or sequence query (L#, E#, or    |S KLK(DLE)+/SQSP
{1,}       | saved query) one or more times   |S L2{1-}/SQSP
 or        |                                  |S NAME/Q{1,}/SQSP
{1-}       |                                  |S CAT(CTG){1,}TATT/SQSN
-----------+----------------------------------+----------------
&          |To join together sequence         |S L1&L3/SQSFP
           | expressions or queries           |S L2&L5{1,3}/SQSP
           | (L#s, E#s, or saved queries)     |S NAME1/Q{2}&NAME2/Q/SQSP
           |                                  |S E1&E3/SQSP

In addition, the caret and the vertical bar may be used. The caret is used at the beginning or at the end of a sequence to search for that sequence at the beginning or end of the sequence field. The vertical bar is the symbol for alternation, i.e., it is used to separate alternate sequence queries.

(1) For more information on specifying variability in subsequence queries, enter 'HELP SQQ' at an arrow prompt in the Registry File.

Specifying Gaps in Subsequence Searches (/SQSP, /SQSFP, and /SQSN):

 Symbol |            Function            |    Search Examples
========+================================+======================
.       |A gap of one residue            |S SY.RPG/SQSP
        |                                |S SY..RPG/SQSP
        |                                |S AAG...TGC/SQSN
--------+--------------------------------+----------------------
.{m}    |A gap of m residues             |S SY.{2}RPG/SQSP
 or     |                                |S SY[2.]RPG/SQSP
[m.]    |                                |
--------+--------------------------------+----------------------
.{m,u}  |A gap of m to u residues        |S GFF.{2,10}LSS/SQSP
 or     |                                |S GFF.{2-10}LSS/SQSP
.{m-u}  |                                |S AAG.{2,5}TGC/SQSN
--------+--------------------------------+----------------------
:       |A gap of zero or one residue    |S AGA:SRI/SQSFP
 or     |                                |S AGA.?SRI/SQSFP
.?      |                                |S AGA.{0,1}SRI/SQSFP
 or     |                                |S AGA.{0-1}SRI/SQSFP
.{0,1}  |                                |
 or     |                                |
.{0-1}  |                                |
--------+--------------------------------+----------------------
.*      |A gap of zero or more residues  |S HLC.*TYG/SQSP
 or     |                                |S HLC.{0,}TYG/SQSP
.{0,}   |                                |S HLC.{0-}TYG/SQSP
 or     |                                |S AAGGCAGATG.*GCAA/SQSN
.{0-}   |                                |
--------+--------------------------------+----------------------
.+      |A gap of one or more residues   |S SY.+TH/SQSFP
 or     |                                |S SY.{1,}TH/SQSFP
.{1,}   |                                |S SY.{1-}TH/SQSFP
 or     |                                |S TCCTG.+GTGG/SQSN
.{1-}   |                                |

Display and Print Formats:

Any combination of individual field codes and any combination of predefined format codes may be used. However, individual codes may not be combined with system predefined format codes. Multiple codes must be separated by commas or spaces. The fields are displayed or printed in the order requested. Highlighting must be ON during SEARCH in order to use the HIT and KWIC formats. The CM (Component Number) field appears in records for multicomponent substances, but it is not a custom display field and cannot be used in display or print requests.
Dictionary Formats (1) Format | Content | Examples ========+====================================+================ AF |Alternate Molecular Formula |D L4 1-4 AF AR |Alternate Registry Number |D L1 3 AR CCI |Component Class Identifier |D CCI 1,3-5 CCN (2) |Condensed Chemical Name |D 20 CCN CDES |Component Descriptor |D CDES 5-10 CI |Substance Class Identifier |D 1-3,7,8 CI CIL |Component Isotope at Unknown |D CIL | Location | CMF |Component Molecular Formula |D L1 CMF 3 CN |Chemical Name |D CN COMP(3) |Composition |D L7 CRN |Component Registry Number |D 1,3,6 CRN L5 DEF |Definition |D DEF DES |Descriptor |D DES 2 DR |Deleted Registry Number |D L8 DR 1-3 FCN (2) |Full Chemical Name |D FCN L3 7 FS |File Segment |D 1,4 FS IL |Isotope at Unknown Location |D IL IN |CA Index Name |D IN L1 4 LC |Registry Number Locator |D LC 3,4 MF |Molecular Formula |D MF PCT |Polymer Class Term |D L3 PCT PR |Preferred Registry Number |D 5,3 PR REF |Number of references in CA, CAOLD, |D REF | CAplus | RN |CAS Registry Number |D L4 RN 3 RR |Replaced Registry Number |D L3 2 RR RSD (4) |Ring System Data |D RSD SCN (5) |Short Chemical Name |D 5-9 SCN SR |Source of Registration |D SR 1,3 L12 SRSD (6)|Short Ring System Data |D SRSD STF |Flat Structure (no stereo indicated)|D L9 1 3 STR (7) |Structure Diagram (includes stereo |D L4 STR | bonds and R/S/E/Z labels when | | available) | STS (7) |Stereo Structure (includes stereo |D STS | bonds when available) | --------+------------------------------------+---------------- ALL |All available fields and names, |DISPLAY L1 1 ALL | including biosequence data and | | BIB, ABS, IND for 10 most recent CA| | references | FIDE |All available names and all |D FIDE 3 7 L6 | substance data, except biosequence | | data (RN, CN, DEF, AR, PR, FS, DR, | | RR, MF, AF, CI, PCT, SR, LC, IL, | | DES, RSD, CRN, CMF, CCI, CDES, CIL,| | STR, COMP, REF) | IDE |Same as FIDE, except only 50 names |D IDE L10 | are displayed and RSD is not | | displayed (IDE is the default) | 
REG     |CAS Registry Number(s)              |D REG
        | (RN, DR, AR, PR, RR)               |
SAM     |IN, SQL, MF, CI, STR, COMP          |D L3 1-18 SAM
SCAN (8)|IN, SQL, MF, CI, STR, COMP          |D SCAN
        | (answer numbers are not            |
        | displayed and the answers are      |
        | displayed in random order)         |
--------+------------------------------------+----------------
HIT (9) |All fields containing hit terms     |D HIT 5-10
KWIC (9)|All hit terms plus 20 words on      |D KWIC 5-10
        | either side                        |

(1) In addition to these substance field codes and formats, bibliographic information for the ten most recent documents that cite the substance in CA can be displayed if combined with at least one substance field, e.g., D RN TI AU. The substance code or format must be given first. The bibliographic formats are found on the CA File summary sheet.
(2) Names are displayed with CN code.
(3) This is a tabular display that lists composition information and Component Registry Numbers for alloys and tabular inorganic substances.
(4) This is a tabular display that lists EA, ES, SZ, RF, RID, and RID Occurrence Count.
(5) The CA Index Name and all OTHER NAMES are displayed with CN code.
(6) This is a tabular display that lists EA, RID, and RID Occurrence Count.
(7) Stereo structure diagrams are only available on graphics terminals and offline prints.
(8) No online display charge for this option. SCAN must be specified on the command line, i.e., D SCAN or DISPLAY SCAN.
(9) HIT and KWIC are available for all dictionary fields except MAC, RC, and CRN, and in all biosequence fields. KWIC is the same as HIT for all fields except DEF and LC. The entire field containing hit terms is highlighted, except for DEF and LC, in which the individual terms are highlighted. The entire RSD table is displayed without highlighting. For NTE, the row(s) of the table containing the hit terms are displayed without highlighting. For SEQ and SEQ3, the amino acid codes causing the hit are highlighted by underlining and also by a statement of their position in the sequence.
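The rule stated under Display and Print Formats, that individual field codes may not be combined with predefined format codes, can be checked before a DISPLAY request is typed. The Python sketch below is a hypothetical helper, not an STN feature. The code sets are transcribed from the Dictionary Formats table; treating HIT and KWIC as predefined formats is an assumption based on their position in that table, and bibliographic codes (such as TI or AU) and biosequence formats are left out.

```python
# Individual field codes from the Dictionary Formats table.
FIELD_CODES = {
    "AF", "AR", "CCI", "CCN", "CDES", "CI", "CIL", "CMF", "CN", "COMP",
    "CRN", "DEF", "DES", "DR", "FCN", "FS", "IL", "IN", "LC", "MF", "PCT",
    "PR", "REF", "RN", "RR", "RSD", "SCN", "SR", "SRSD", "STF", "STR", "STS",
}
# Predefined formats (assumption: HIT and KWIC belong in this group).
PREDEFINED_FORMATS = {"ALL", "FIDE", "IDE", "REG", "SAM", "SCAN", "HIT", "KWIC"}


def check_display_codes(codes: list[str]) -> list[str]:
    """Return a list of problems found in a DISPLAY code list (empty if none)."""
    requested = [code.upper() for code in codes]
    problems = []
    unknown = [c for c in requested if c not in FIELD_CODES | PREDEFINED_FORMATS]
    if unknown:
        problems.append("unknown code(s): " + ", ".join(unknown))
    fields = [c for c in requested if c in FIELD_CODES]
    formats = [c for c in requested if c in PREDEFINED_FORMATS]
    if fields and formats:
        problems.append(
            "cannot mix field code(s) " + ", ".join(fields)
            + " with predefined format(s) " + ", ".join(formats)
        )
    return problems


if __name__ == "__main__":
    print(check_display_codes(["RN", "MF", "STR"]))
    print(check_display_codes(["IDE", "RN"]))
```

The helper looks only at the codes themselves; answer numbers and L-numbers in a request such as `D L4 1-4 AF` would need to be stripped first.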
Biosequence Formats (1)

 Format |              Content               |    Examples
========+====================================+================
NA      |Nucleic Acid                        |D 6 9 11 NA
NTE     |Note                                |D NTE
SEQ     |Sequence (1-letter codes)           |D SEQ
SEQ3    |Sequence (3-letter codes)           |D SEQ3 1-10
SQL     |Sequence Length                     |D L3 SQL
--------+------------------------------------+----------------
SQD     |RN, AR, PR, DR, RR, FS, SQL, NTE,   |D 5 SQD
        |SEQ                                 |
SQD3    |RN, AR, PR, DR, RR, FS, SQL, NTE,   |D 2-4 SQD3
        |SEQ3                                |
SQIDE   |RN, CN, DEF, AR, PR, DR, RR, FS,    |D L4 SQIDE
        |SQL, NA, NTE, SEQ, MF, AF, CI, PCT, |
        |SR, LC, IL, DES, STR, REF           |
SQIDE3  |Same as SQIDE except that 3-letter  |D L4 SQIDE3
        |codes are used for protein sequences|
SQN     |RN, CN, AR, PR, FS, SQL, DR, RR, REF|D SQN L5 6-9

(1) In addition to these substance field codes and formats, bibliographic information for the ten most recent documents that cite the substance in CA can be displayed if combined with at least one substance field, e.g., D RN TI AU. The substance code or format must be given first. The bibliographic formats are found on the CA File summary sheet.

SELECT and SORT Fields

The SELECT command is used to create E-numbers or an L-number containing terms taken from the specified field in an answer set. The SORT command is used to rearrange the search results in either alphabetic or numeric order of the specified field(s).
Field Name Field Code SELECT (1) SORT Alternate Molecular Formula AF Y (2) N Alternate Registry Number AR Y (3) N CA Index Name IN Y (4) Y CAS Registry Number RN Y Y Chemical Name CN Y (5) N Class Identifier CI Y N Component Class Identifier CCI Y (6) N Component Molecular Formula CMF Y (7) N Component Registry Number CRN Y N Definition DEF Y N Deleted Registry Number DR Y (3) N Elemental Analysis for EA Y N Ring System Elemental Sequence for ES Y N Ring System File Segment FS Y Y Full Chemical Name FCN Y N Molecular Formula MF Y N Names NAME Y (8) N Nucleic Acid Sequence SQEN Y N (exact search form) Nucleic Acid Sequence SQSN Y N (subsequence search form) Polymer Class Term PCT Y N Preferred Registry Number PR Y (3) N Protein Sequence SQEFP Y N (exact family search form) Protein Sequence SQEP Y N (exact search form) Protein Sequence SQSFP Y N (subsequence family search form) Protein Sequence SQSP Y N (subsequence search form) References REF N Y Registry Number Locator LC Y (9) N Registry Numbers and Names CHEM Y (10) N (default) Replacing Registry Number RR Y (5) N Ring Identifier RID Y N Ring System Formula RF Y N Sequence (1-letter codes) SEQ Y N Sequence (3-letter codes) SEQ3 Y N Sequence Length SQL N Y Short Chemical Names SCN Y (4) N Size for the Ring System SZ Y N Source of Registration SR Y N (1) HIT may be used to restrict terms extracted to terms that match the search expression used to create the answer set, e.g., SEL HIT CN. (2) /MF is appended. (3) /RN is appended. (4) /CN is appended. (5) CA Index Name, first 50 names in alphabetical order, and any additional hit names are selected. (6) /CI is appended. (7) /BI is appended. (8) All names except inverted names are selected and /BI is appended. For nucleic acids from the GenBank database, SELECT NAME extracts GenBank Locus ID and GenBank numbers. GenBank numbers may be used as search terms in the GenBank File or MEDLINE File. 
(9) E-numbers containing the files listed in this field may be used in the FILE and INDEX commands in place of the file names. (10) AR, DR, PR, RN, RR, and all names except inverted names are selected and /BI is appended. Sample Records: DISPLAY IDE (default) RN 57-88-5 REGISTRY CN Cholest-5-en-3-ol (3.beta.)- (9CI) (CA INDEX NAME) OTHER CA INDEX NAMES: CN Cholesterol (8CI) OTHER NAMES: CN (-)-Cholesterol CN .DELTA.5-Cholesten-3.beta.-ol CN 3.beta.-Hydroxycholest-5-ene CN 5:6-Cholesten-3.beta.-ol CN Cholest-5-en-3.beta.-ol CN Cholesterin CN Cholesteryl alcohol CN Dythol CN Lidinit CN Lidinite CN Provitamin D FS STEREOSEARCH MF C27 H46 O CI COM LC STN Files: ANABSTR, BEILSTEIN*, BIOBUSINESS, BIOSIS, CA, CAOLD, CAPLUS, CAPREVIEWS, CASREACT, CEN, CHEMINFORMRX, CHEMLIST, CBNB, CIN, CJACS, CSCHEM, CSNB, DDR, DETHERM*, DRUGR, DRUGU, EMBASE, GMELIN*, HODOC*, IFICDB, IFIPAT, IFIUDB, IPA, MEDLINE, MRCK*, MSDS-OHS, MSDS-SUM, NAPRALERT, PDLCOM*, PIRA, PNI, PROMT, RTECS*, SPECINFO, TOXLINE, TOXLIT, USAN, VTB (*File contains numerically searchable property data) Other Sources: DSL**, EINECS**, TSCA** (**Enter CHEMLIST File for up-to-date regulatory information) DES 4:3B.CHOLEST Me . . . CH .....(CH2)3.........CHMe2 . . . Me . . . . C . . . C . C. . C. . C Me . . . . . . . C . C C........C . . . . . . C. .C. .C. . . . . . . C C. C . . . : . HO . .C. :C. 
STEREO DIAGRAM AVAILABLE WITH GRAPHICS TERMINAL OR OFFLINE PRINT

49773 REFERENCES IN FILE CAPLUS
49543 REFERENCES IN FILE CA (1967 TO DATE)
5820 REFERENCES TO NON-SPECIFIC DERIVATIVES IN FILE CA
15 REFERENCES IN FILE CAOLD (PRIOR TO 1967)

DISPLAY IDE (default)

RN   91386-77-5  REGISTRY
CN   Interferon .alpha.1 (human leukocyte protein moiety reduced), 1-L-serine-  (9CI)  (CA INDEX NAME)
FS   PROTEIN SEQUENCE
MF   Unspecified
CI   MAN
LC   STN Files:   CA, CAPLUS

*** STRUCTURE DIAGRAM IS NOT AVAILABLE ***
*** USE 'SQD' OR 'SQIDE' FORMATS TO DISPLAY SEQUENCE ***

1 REFERENCES IN FILE CA (1967 TO DATE)
1 REFERENCES IN FILE CAPLUS

DISPLAY SQIDE (Protein Sequence Record)

RN   91386-77-5  REGISTRY
CN   Interferon .alpha.1 (human leukocyte protein moiety reduced), 1-L-serine-  (9CI)  (CA INDEX NAME)
FS   PROTEIN SEQUENCE
SQL  166
SEQ      1 SDLPETHSLD NRRTLMLLAQ MSRISPSSCL MDRHDFGFPQ EEFDGNQFQK
        51 APAISVLHEL IQQIFNLFTT KDSSAAWDED LLDKFCTELY QQLNDLEACV
       101 MQEERVGETP LMNADSILAV KKYFRRITLY LTEKKYSPCA WEVVRAEIMR
       151 SLSLSTNLQE RLRRKE
MF   Unspecified
CI   MAN
LC   STN Files:   CA, CAPLUS

1 REFERENCES IN FILE CA (1967 TO DATE)
1 REFERENCES IN FILE CAPLUS

DISPLAY SQIDE (Nucleic Acid Sequence Indexed by CAS)

RN   91449-61-5  REGISTRY
CN   Deoxyribonucleic acid (Tikaut provirus 5'-long terminal repeat)  (9CI)  (CA INDEX NAME)
FS   NUCLEIC ACID SEQUENCE
SQL  641
NA   186 a   170 c   160 g   125 t
NTE  doublestranded
SEQ      1 tgaaagaccc caccataagg cttagcaagc tagctgcagt aacgccattt
        51 tgcaaggcat gaaaaagtac cagagctgag ttctcaaagt caacaacgaa
       101 gtttagttaa agaataaggc tgaacaaaac tgggacaggg gccaaacagg
       151 atatctgtgg tcgagcagct agggccccgg ctcagggcca agaacagatg
       201 gtactcagat aaagcgaagg gctgaacaaa acgggacagg ggccaaacag
       251 gatgggggcc aaacaggata tctgtggtcg agcacctggg ccccggctca
       301 gggccaagaa cagatggtac tcagataaag cgaaactaac aacagtttct
       351 ggaaagtccc acctcagttt caagttcccc aaaagaccgg gaaaaacccc
       401 aagccttatt taaactaacc aatcagctcg cttctcgctt ctgtaacccg
       451 cgctttttgc tcccagccct ataaaaaggg taaaaacccc acactcggcg
       501 ccccagtcct ccgatagact gagtcgcccg ggtacccgtg tatccaataa
       551 agccttttgc tgttgcatcc gaatcgtggt ctcgctgatc cttgggaggg
       601 tctcctcaga gtgattgact gcccagcctg ggggtctttc a
MF   Unspecified
CI   MAN
LC   STN Files:   CA, CAPLUS, TOXLIT

1 REFERENCES IN FILE CA (1967 TO DATE)
1 REFERENCES IN FILE CAPLUS

DISPLAY SQIDE (Nucleic Acid Sequence Registered from GenBank.RTM.)

RN   139065-61-5  REGISTRY
CN   GenBank M12334  (9CI)  (CA INDEX NAME)
FS   NUCLEIC ACID SEQUENCE
SQL  35
NA   6 a   8 c   14 g   7 t
NTE  doublestranded
SEQ      1 ggaggctcat ttgcagttga ggccagcagg tcggc
MF   C342 H428 N138 O209 P34
CI   MAN
SR   GenBank
LC   STN Files:   GENBANK
DES  5:ALL,B-D-ERYTHRO

DISPLAY FIDE

RN   53784-90-0  REGISTRY
CN   .gamma.-Cyclodextrin, 6A,6B,6C,6D,6E,6F,6G,6H-octadeoxy-  (9CI)  (CA INDEX NAME)
OTHER CA INDEX NAMES:
CN   2,4,7,9,12,14,17,19,22,24,27,29,32,34,37,39-Hexadecaoxanonacyclo[36.2.2.23,6.28,11.213,16.218,21.223,26.228,31.233,36]hexapentacontane, .gamma.-cyclodextrin deriv.  (9CI)
MF   C48 H80 O32
LC   STN Files:   BEILSTEIN*, CA, CAPLUS
         (*File contains numerically searchable property data)
DES  6:GAMMA-CYCLODEXTRIN

Ring System Data

Elemental   ! Elemental   ! Size of    !Ring System! Ring     ! RID
Analysis    ! Sequence    ! the Rings  ! Formula   !Identifier!Occurrence
EA          ! ES          ! SZ         ! RF        ! RID      ! Count
============+=============+============+===========+==========+==========
C5O-C5O-C5O-!OC5-OC5-OC5- !6-6-6-6-6-6-!C40O16     !14246.1.1 !1
C5O-C5O-C5O-!OC5-OC5-OC5- !6-6-40      !           !          !
C5O-C5O-    !OC5-OC5-     !            !           !          !
C24O16      !OCOC2OCOC2OCO!            !           !          !
            !C2OCOC2OCOC2O!            !           !          !
            !COC2OCOC2OCOC!            !           !          !
            !2            !            !           !          !

[Structure diagram, Pages 1-A and 1-B]

1 REFERENCES IN FILE CA (1967 TO DATE)
1 REFERENCES IN FILE CAPLUS

DISPLAY CCN

CN   Methanaminium, N-[4-[[4-(dimethylamino)phenyl]phenylmethylene]-2,5-cyclohexadien-1-ylidene]-N-methyl-, chloride  (9CI)  (CA INDEX NAME)
OTHER CA INDEX NAMES:
CN   C.I. Basic Green 4 (8CI); Victoria Green WB (6CI)
OTHER NAMES:
CN   Acryl Brilliant Green B; ADC Malachite Green Crystals; Aizen Malachite Green; Aizen Malachite Green Crystals; Aniline green; Astra Malachite Green; Astra Malachite Green B; Astra Malachite Green BXX; Atlantic Malachite Green; Basic Green 4; Basonyl Green 830; Benzal Green; Benzaldehyde green; Bronze Green Toner A 8002; Burma Green B; C.I. 42000; Calcozine Green V; China Green; Diabasic Malachite Green; Diamond Green B extra; Diamond Green BX; Diamond Green P Extra; Green MX; Grenoble Green; Hidaco Malachite Green Base; Hidaco Malachite Green LC; Hidaco Malachite Green SC; Light Green N; Lincoln Green Toner B 15-2900; Malachite green; Malachite Green A; Malachite Green AN; Malachite Green B; Malachite green chloride; Malachite Green CP; Malachite Green Crystals; Malachite Green Crystals BPC; Malachite Green J 3E; Malachite Green Powder; Malachite Green WS; Malachite Lake Green A; Mitsui Malachite Green; New Victoria Green Extra I; New Victoria Green Extra II; New Victoria Green Extra O; Oji Malachite Green; Solid Green Crystals O; Solid Green O; Super Ick Cure; Tertrophene Green M; Tokyo Aniline Malachite Green; Verona Basic Green M; Victoria Green; Victoria Green (basic dye); Victoria Green B; Victoria Green S; Victoria Green WPB