NUCLEIC ACID DATABASE NEWSLETTER
--------------------------------

Number 2. February 1992.


THE NUCLEIC ACID DATABASE:
--------------------------
A RELATIONAL DATABASE OF NUCLEIC ACID CRYSTAL STRUCTURES


INTRODUCTION:
------------

This second newsletter of the Nucleic Acid Database updates the
information and data presented in the first newsletter, released in
December 1991. Since then, five new DNA structures have been added, and
the coordinates of three new structures have been released. The current
release consists of updated tables describing the 174 DNA and RNA
structures now contained in the database, and coordinate files for all
structures for which coordinates are available.

The Nucleic Acid Database Project is a collaborative effort between
Helen Berman and Wilma Olson of Rutgers University and David Beveridge
of Wesleyan University, and is funded by the Division of
Instrumentation and Resources of the National Science Foundation
(DIR 90 12772).

If you use the tables, please use the following citation:

   The Nucleic Acid Database: A Guide to the Use of a Relational
   Database of Nucleic Acid Crystal Structures, 1991. Technical Report.
   Helen M. Berman, Wilma K. Olson, John Westbrook, Anke Gelbin,
   Tamas Demeny, Shuhsin Hsieh, Center for Computational Chemistry,
   Rutgers University, New Brunswick, NJ and David Beveridge,
   Wesleyan University, Middletown, CT.


WHAT IS NEW?
------------

1.
NEW STRUCTURES:

   The following new structures have been added to NDB:

   GDL010   5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*CP*GP*CP*G)-3',
            HOECHST 33258, 0 C, PIPERAZINE UP
   GDL011   5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*CP*GP*CP*G)-3',
            HOECHST 33258, 0 C, PIPERAZINE DOWN
   GDL012   5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*CP*GP*CP*G)-3',
            HOECHST 33258, -25 C, PIPERAZINE DOWN
   GDL013   5'-D(*CP*GP*CP*GP*AP*AP*TP*TP*CP*GP*CP*G)-3',
            HOECHST 33258, -100 C, PIPERAZINE DOWN
   ZDF029   5'-D(*CP*GP*CP*GP*CP*G)-3', SPERMINE

2. NEW COORDINATES:

   Coordinates for the following structures have been released:

   ADHP36   5'-D(*GP*CP*CP*CP(M)*GP*GP*GP*C)-3',
            G5 CARRIES A CHARGED, ACHIRAL METHYLENE PHOSPHONATE
   BDLB26   5'-D(*CP*GP*CP*(M)GP*AP*AP*TP*TP*TP*GP*CP*G)-3'
   UDM010   5'-D(*CP*GP*CP*AP*GP*AP*AP*TP*TP*CP*GP*CP*G)-3'

3. COORDINATE FILES AVAILABLE:

   The coordinate files for all released structures are now available
   electronically from the NUCLEIC ACID DATABASE LIBRARY (NDBLIB). To
   obtain a coordinate file through the electronic mail server, send
   the following mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send FILENAME from coordinates

   where the word FILENAME is replaced by the NDB File Name of the
   desired structure. For example, to obtain coordinates for the
   structure called addb01, you would send this mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send addb01 from coordinates

   For a brief description of each structure available, refer to the
   released table called 'names'. For more detailed instructions on how
   to communicate with the NDB Library, see the ACCESS INFORMATION
   section later in this newsletter.

   The coordinates are released in the orthogonal x, y, z form.

   NOTE ON SYMMETRY GENERATED SECOND STRANDS:

   The coordinate files for the structures contain coordinates for the
   asymmetric unit only. If a structure has only one strand per
   asymmetric unit, you can obtain the symmetry-generated second strand
   from a second file for this structure.
   This file has the same name as the first, with the letter 's' added
   at the end. For example, the structure adh008 has two files
   associated with it:

      adh008 and adh008s

   File adh008 contains coordinates for the asymmetric unit ONLY.
   File adh008s contains coordinates for the asymmetric unit AND for
   the symmetry-generated second strand.

4. NEW TABLES:

   TORSION ANGLES:

   Tables containing the torsion angles have been put into the
   electronic library for type A, B, and Z DNA's. These tables have
   names of the following format:

      torsion_FILENAME.ps     -- for PostScript versions, and
      torsion_FILENAME.ascii  -- for ascii text versions

   where the word FILENAME is replaced by the NDB File Name of the
   desired structure. For detailed instructions on how to obtain these
   tables, see the ACCESS INFORMATION section later in this newsletter.

5. UPDATED TABLES:

   All the originally released tables have been updated to include
   information about the newly added structures.

6. SUBSCRIPTIONS:

   To subscribe to the Nucleic Acid Database mailing list, send the
   following mail message:

      To: ndblib@helix.rutgers.edu
      Subject: subscribe

   To subscribe for someone else, send a mail message to
   ndbadmin@helix.rutgers.edu indicating the person's name and
   electronic mail address.


STAFF OF THE NUCLEIC ACID DATABASE PROJECT:
------------------------------------------

Helen M. Berman, Wilma K. Olson, David Beveridge--Principal Investigators
John Westbrook--Director, Center for Computational Chemistry,
                Rutgers University
Anke Gelbin--Database Coordinator
Tamas Demeny--Database Programmer
Shuhsin Hsieh--Database Programmer
Donna Horton--Staff Assistant


ACCESS INFORMATION:
------------------

The released information is located in the NUCLEIC ACID DATABASE
LIBRARY (NDBLIB). You can communicate with this library in two ways:

   1. ELECTRONIC MAIL (MAIL SERVER)
   2. FILE TRANSFER PROGRAM (ANONYMOUS FTP)

1.
MAIL SERVER INSTRUCTIONS:

   The electronic mail addresses of NDBLIB are:

      Internet:     ndblib@helix.rutgers.edu
                    ndblib@rutchm.rutgers.edu
      EARN/BITNET:  ndblib@rutchm

   Please note that in the following examples the Subject line of your
   mail message is read by the computer; it is therefore important to
   follow these instructions exactly. Nothing other than the Subject
   line needs to be sent.

   To obtain general information about NDBLIB, send the following mail
   message:

      To: ndblib@helix.rutgers.edu
      Subject: send index

   The NDBLIB is currently divided into four libraries:

   A. newsletter:

   This library contains this newsletter and all previous and future
   newsletters as computer files.

   To get a list of files (index) in this library, send the following
   mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send index from newsletter

   To get a copy of a newsletter itself, send this message, replacing
   the word FILENAME with the name of the file containing the desired
   newsletter, as given in the index:

      To: ndblib@helix.rutgers.edu
      Subject: send FILENAME from newsletter

   FOR EXAMPLE:

   The index you receive from the library called newsletter will
   contain this information about the newsletters:

      File Name:    Description:
      -----------------------------------------------------------------
      news.dec91    Introductory newsletter, released in December 1991.
      news.feb92    February newsletter update.

   Therefore, to get a copy of this newsletter, you would send the
   following message:

      To: ndblib@helix.rutgers.edu
      Subject: send news.feb92 from newsletter

   All files in this library are ascii text files, meaning that they
   can be listed and printed directly.

   B. reports_ascii:

   This library contains the ascii text versions of the tables
   currently available. You can find a description of the contents of
   the tables at the end of this newsletter. All files in this
   directory are ascii files, so they can be viewed and printed
   directly.
   To get a list of the released tables, their descriptions, and the
   names of the files containing them, send the following mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send index from reports_ascii

   To get an ascii table itself, send this message, replacing the word
   FILENAME with the name of the file containing the desired table, as
   given in the index:

      To: ndblib@helix.rutgers.edu
      Subject: send FILENAME from reports_ascii

   NOTE: These ascii files are 130 characters wide, so the lines will
   be broken up if you view them on an 80-character-wide screen. For a
   cleaner view of the tables, use a screen that is at least 130
   characters wide. When printing hard copies, print them in landscape
   format (with the longer side of the paper at the top and bottom).

   C. reports_ps:

   This library contains the PostScript versions of the tables
   currently available. You can find a description of the contents of
   the tables at the end of this newsletter. All files containing
   tables in this directory are PostScript files. To view these files,
   you must print them on a printer that can handle PostScript files.

   To get a list of the released tables, their descriptions, and the
   names of the files containing them, send the following mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send index from reports_ps

   To get a PostScript table itself, send this message, replacing the
   word FILENAME with the name of the file containing the desired
   table, as given in the index:

      To: ndblib@helix.rutgers.edu
      Subject: send FILENAME from reports_ps

   D. coordinates:

   This library contains the coordinate files of structures for which
   coordinates are currently available. All files in this directory are
   ascii files, so they can be viewed and printed directly.
   To get a list of the structures for which coordinates are stored in
   this directory, send the following mail message:

      To: ndblib@helix.rutgers.edu
      Subject: send index from coordinates

   To get a coordinate file itself, send this message, replacing the
   word FILENAME with the name of the file containing the desired
   structure, as given in the index. Note that the name of the file
   containing the coordinates is the same as the NDB file name (NDB ID)
   for that structure.

      To: ndblib@helix.rutgers.edu
      Subject: send FILENAME from coordinates

   Problems or questions concerning the NDBLIB server should be
   addressed to:

      ndbadmin@helix.rutgers.edu

2. ANONYMOUS FTP:

   The ftp program allows you to transfer files directly from the
   library into your own directory. The following are basic
   instructions on how to access the library via ftp. For a complete
   description of how to use ftp in general, refer to the manual of
   your particular computer system. Since helix uses the UNIX system,
   you will issue UNIX commands when using anonymous ftp.

   Start ftp by typing:

      ftp helix.rutgers.edu

   Log in as:

      anonymous

   For the password:

      USE YOUR OWN USERNAME

   To access NDBLIB, you first have to enter its main directory, called
   'pub'. Do this by typing the following command:

      cd pub

   Now you can choose one of the four libraries described in the MAIL
   SERVER INSTRUCTIONS section. Choose a library (subdirectory) by
   typing one of the following commands:

      cd newsletter
      cd reports_ascii
      cd reports_ps
      cd coordinates

   To obtain a listing on the screen of the files in your current
   library (subdirectory), type:

      ls

   To obtain a complete description of the files in this library
   (subdirectory), type:

      get index

   This will create a file called index in your directory.

   To transfer any of the files to your directory, type:

      get FILENAME

   where FILENAME is the name of the file, as it appears in the listing
   and in the index.

   To exit the ftp program, type:

      quit

   An example session is shown further below.
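
   The ftp steps above can also be scripted. The following is only a
   minimal sketch in Python using the standard ftplib module; it is not
   part of the NDB software. The function name fetch_ndb_file and its
   ftp_factory parameter are hypothetical names introduced for this
   illustration, and the sketch assumes that the host and directory
   layout described in this newsletter are still reachable.

```python
from ftplib import FTP

# Host and library names are taken from this newsletter.
NDB_HOST = "helix.rutgers.edu"
LIBRARIES = ("newsletter", "reports_ascii", "reports_ps", "coordinates")


def fetch_ndb_file(library, filename, username, ftp_factory=FTP):
    """Download one file from an NDBLIB library and return its bytes.

    ftp_factory is a hypothetical hook (defaulting to ftplib.FTP) so the
    function can be exercised without a network connection.
    """
    if library not in LIBRARIES:
        raise ValueError(f"unknown library: {library!r}")
    chunks = []
    ftp = ftp_factory(NDB_HOST)           # connects to the server
    try:
        ftp.login("anonymous", username)  # password: your own username
        ftp.cwd("pub")                    # always start in 'pub'
        ftp.cwd(library)                  # then enter one library
        ftp.retrbinary(f"RETR {filename}", chunks.append)
    finally:
        ftp.quit()                        # end the session
    return b"".join(chunks)
```

   For example, fetch_ndb_file("reports_ascii", "struct.ascii",
   "yourname") would mirror the 'get struct.ascii' step of the manual
   procedure, returning the file contents instead of writing them to
   your directory.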
   The following is an example session using anonymous ftp. Your
   system's responses might be slightly different.

   Start ftp by typing:

      % ftp helix.rutgers.edu

   In this example, type only the words immediately following the
   'greater than' sign (>). The exception is the line that says
   'Password:', where you type your own username after the colon. The
   explanations on the right-hand side, following the # signs, do not
   appear on the screen; they are added here only for clarification.
   All other lines are system responses. Your system may use
   'Helix.Rutgers.Edu>' as a prompt instead of the 'ftp>' appearing in
   this example.

   Connected to helix.rutgers.edu.
   220 helix FTP server (IRIX version 5.46 Nov 16 1990 19:11) ready.
   Remote system type is UNIX.
   Using binary mode to transfer files.
   ftp> user anonymous                   #log in as anonymous
   331 Guest login ok, send ident as password.
   Password:                             #type your own username
   230 Guest login ok, access restrictions apply.
   ftp> cd pub                           #always start with this
   250 CWD command successful.
   ftp> cd reports_ascii                 #type one of these:
                                         #cd newsletter
                                         #cd reports_ascii
                                         #cd reports_ps
                                         #cd coordinates
   250 CWD command successful.
   ftp> get index                        #type in the name of the
                                         #file you want to get
                                         #or type 'ls' for a list
                                         #of files.
   local: index remote: index
   200 PORT command successful.
   150 Opening BINARY mode data connection for 'index' (6061 bytes).
   226 Transfer complete.
   6061 bytes received in 0.07 seconds (84.56 Kbytes/s)
   ftp> get struct.ascii                 #get another file
   local: struct.ascii remote: struct.ascii
   200 PORT command successful.
   150 Opening BINARY mode data connection for 'struct.ascii' (21199 bytes).
   226 Transfer complete.
   21199 bytes received in 0.04 seconds (517.55 Kbytes/s)
   ftp> quit                             #end session with ftp
   221 Goodbye.
   %


THE CURRENT RELEASE:
-------------------

This release consists of the following tables, available as both
PostScript and text files in the reports_ps and reports_ascii
libraries, respectively.
The coordinate files in ascii format are located in the library called
coordinates.


TABLE 1: NAMES

This table describes the naming conventions for all structures. It
includes the following information:

   -File name used by Nucleic Acid Database (NDB),
   -File name used by Protein Data Bank (PDB),
   -File name used by Cambridge Structural Database (CSD),
   -Compound Name,
   -Coordinate status of structure in NDB
    (Y = available, P = in preparation, * = not available).

The table is located in the following files in the library:

File name in   File name in    Contents:
reports_ps:    reports_ascii:
-------------  --------------  ---------------------------------------
names_a.ps     names_a.ascii   -DNA's and RNA's with structure type A
names_b.ps     names_b.ascii   -DNA's with structure type B
                                (with no drugs bound)
names_d.ps     names_d.ascii   -DNA-DRUG complexes and RNA-DRUG
                                complexes
names_g.ps     names_g.ascii   -DNA-GROOVE-BINDER complexes
names_u.ps     names_u.ascii   -DNA's and RNA's with unusual structure
                                types
names_z.ps     names_z.ascii   -DNA's (and DNA/RNA hybrids) with
                                structure type Z
names.ps       names.ascii     -This file contains all six of the
                                above tables in one.

IMPORTANT NOTE: This table should be used in conjunction with all of
the other tables. It is your key to identifying the structures in the
database under whichever naming convention is convenient.


TABLE 2: CITATIONS

This table lists the complete citation of the primary references for
all structures.
The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
citation_a.ps   citation_a.ascii  -DNA's and RNA's with structure type A
citation_b.ps   citation_b.ascii  -DNA's with structure type B
                                   (with no drugs bound)
citation_d.ps   citation_d.ascii  -DNA-DRUG complexes and RNA-DRUG
                                   complexes
citation_g.ps   citation_g.ascii  -DNA-GROOVE-BINDER complexes
citation_u.ps   citation_u.ascii  -DNA's and RNA's with unusual
                                   structure types
citation_z.ps   citation_z.ascii  -DNA's (and DNA/RNA hybrids) with
                                   structure type Z
citation.ps     citation.ascii    -This file contains all six of the
                                   above tables in one.


TABLE 3: STRUCTURE SUMMARY

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Structure type (A, B, Z, RH = right handed, U = unusual),
   -Base modification (yes/no),
   -Phosphate modification (yes/no),
   -Mismatch (yes/no),
   -Drug name (for structures with drugs).

The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
struct.ps       struct.ascii      -All structures.


TABLE 4: DRUGS

This table includes the following information:

   -NDB file name of structure,
   -Sequence of the chain,
   -Chemical name of drug,
   -Binding type.

The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
drugs.ps        drugs.ascii       -Structures with drugs
                                   Listed in this order:
                                   binding type INTERCALATION,
                                   binding type MINOR GROOVE BINDER,
                                   other binding types.


TABLE 5: MISMATCHES

This table includes the following information:

   -NDB file name of structure,
   -Compound name,
   -Mismatched residue of the chain 'A',
   -Mismatched residue of the chain 'B'.
The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
mismatch.ps     mismatch.ascii    -Structures with mismatches.


TABLE 6: BASE MODIFIERS

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Name of base modifier,
   -Name of atom to which base modifier is connected,
   -Residue of the atom to which base modifier is connected,
   -Name of chain ('A' or 'B').

The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
basemod.ps      basemod.ascii     -Structures with base modifications.


TABLE 7: PHOSPHATE MODIFIERS

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Name of phosphate modifier,
   -Name of atom replaced by this modifier,
   -Residue of the atom replaced,
   -Name of chain ('A' or 'B').

The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
phosmod.ps      phosmod.ascii     -Structures with phosphate
                                   modifications.


TABLE 8: CELL DIMENSIONS

This table includes the following information:

   -NDB file name of structure,
   -Compound name, (not available in text file)
   -Sequence of the chain, (not available in text file)
   -Cell dimensions a, b, c,
   -Cell angles alpha, beta, gamma,
   -Space group,
   -Coordinate status in NDB
    (Y = available, P = in preparation, * = not available).

The table is located in the following files in the library:

File name in    File name in      Contents:
reports_ps:     reports_ascii:
--------------  ----------------  --------------------------------------
celldim.ps      celldim.ascii     -All structures.


TABLE 9: TORSION ANGLES

There is a separate torsion table for each A, B, and Z DNA structure.
Each of these tables contains the following torsion angles for each
residue in the structure:

TORSIONS:                   GREEK SYMBOL CONVENTIONS:

(n-1)O3'- P  - O5'- C5'     Alpha      Omega
P  - O5'- C5'- C4'          Beta       Phi
O5'- C5'- C4'- C3'          Gamma      Psi
C5'- C4'- C3'- O3'          Delta      Psi prime
C4'- C3'- O3'- P            Epsilon    Phi prime
C3'- O3'- P  - O5'(n+1)     Zeta       Omega prime
O4'- C1'- N1/N9-C2/C4       Chi        Chi
O4'- C1'- N1/N9-C4/C8       Chi        Chi (in .ps file only)
C4'- O4'- C1'- C2'          Nu0
O4'- C1'- C2'- C3'          Nu1
C1'- C2'- C3'- C4'          Nu2
C2'- C3'- C4'- O4'          Nu3
C3'- C4'- O4'- C1'          Nu4

The tables are located in the following files in the library:

File name in         File name in            Contents:
reports_ps:          reports_ascii:
-------------------  ----------------------  ---------------------------
torsion_FILENAME.ps  torsion_FILENAME.ascii  -Torsion angles for the
                                              structure whose NDB file
                                              name is substituted for
                                              the word FILENAME.
For example:
torsion_addb01.ps    torsion_addb01.ascii    -Torsion angles for
                                              structure addb01.
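

Each torsion listed above is defined by four atoms, so it can be
recomputed from the orthogonal x, y, z coordinates in the coordinate
files. The following is only a minimal sketch in Python of the standard
four-point dihedral calculation; it is not the NDB's own program. The
function name torsion_angle is a hypothetical name introduced here. The
sketch assumes the usual sign convention (positive when, looking from
the second atom to the third, the front bond must turn clockwise to
eclipse the rear bond) and reports angles between -180 and +180
degrees; the released tables may use a different range.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p1, p2, p3, p4):
    """Return the torsion p1-p2-p3-p4 in degrees.

    Each argument is an (x, y, z) tuple in orthogonal coordinates.
    """
    b1 = _sub(p2, p1)          # bond 1 -> 2
    b2 = _sub(p3, p2)          # central bond 2 -> 3
    b3 = _sub(p4, p3)          # bond 3 -> 4
    n1 = _cross(b1, b2)        # normal of the plane 1-2-3
    n2 = _cross(b2, b3)        # normal of the plane 2-3-4
    x = _dot(n1, n2)
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))
```

For Alpha of residue n, for example, the four points would be O3' of
residue n-1 followed by P, O5' and C5' of residue n, in the order given
in the TORSIONS column. As a quick check, the points (1,0,0), (0,0,0),
(0,0,1), (0,1,1) give a torsion of approximately +90 degrees.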