NUCLEIC ACID DATABASE NEWSLETTER
--------------------------------
Number 1. December 1991.


THE NUCLEIC ACID DATABASE:
--------------------------
A RELATIONAL DATABASE OF NUCLEIC ACID CRYSTAL STRUCTURES


INTRODUCTION:
------------
This is the first newsletter of the Nucleic Acid Database (NDB), announcing
the introductory release of data by the NDB Project. The goal of this
project is to assemble and distribute structural information about nucleic
acids. Data from the literature and from the Protein Data Bank have been
put into a relational database -- SYBASE(TM). For the present, tables
describing several important properties of the structures will be produced
and made available via electronic mail and anonymous ftp. In the future, we
plan to make the software available to produce reports describing any of
the stored properties of any subset of the structures in the database.

The Nucleic Acid Database Project represents a collaborative effort between
Helen Berman and Wilma Olson of Rutgers University and David Beveridge of
Wesleyan University, and is funded by the Division of Instrumentation and
Resources of the National Science Foundation (DIR 90 12772).

If you use the tables, please use the following citation:

   The Nucleic Acid Database: A Guide to the Use of a Relational Database
   of Nucleic Acid Crystal Structures, 1991. Technical Report.
   Helen M. Berman, Wilma K. Olson, John Westbrook, Anke Gelbin,
   Tamas Demeny.
   Center for Computational Chemistry, Rutgers University, New Brunswick, NJ
   and David Beveridge, Wesleyan University, Middletown, CT.

This first distribution consists of eight tables describing the 169 DNA and
RNA structures presently contained in the database. The tRNA crystal
structures and the DNA and RNA fiber structures will soon be added.


FUTURE PLANS:
-------------

PHASE 1--TABLES:

Starting February 1, 1992, there will be a regular NDB release, accompanied
by updated editions of this newsletter. Initially, these releases, like the
present one, will consist of tables prepared at NDB about the contents of
the database. The number of tables released will steadily increase,
describing more and more information about the structures. As the database
grows, these tables will be updated regularly to include the properties of
all the nucleic acid structures available.

The present release of tables contains descriptions of structural
properties (for example, structure types, cell dimensions, and
modifications) and references for all structures. The tables created in the
near future will contain a much wider variety of information, such as
crystallization descriptions, data collection methods, and refinement
methods.

These tables can be inspected and cross-referenced easily to gather
information about a number of properties of a structure or group of
structures. The columns of the tables can be scanned to pick out groups of
structures having certain common characteristics. Thus, these tables allow
you to obtain information about all known nucleic acid structures at a
glance.

PHASE 2--STRUCTURE FILES:

In the future, complete NDB flat files will also be released for all
structures.
These files contain detailed encoded information about structural
properties, crystallization, data collection, and refinement, as well as
Cartesian coordinates, occupancies, and temperature factors for the
structures where coordinates are publicly available. Fractional coordinates
will also be available. The set of tables released will be expanded even
further to include tables about bond angles, distances, torsion angles,
base geometries, and other derived quantities.

PHASE 3--PROGRAM:

In the final phase of the database project, the NDB database application
program will be made available for public use. This program will allow
direct access to the database by any user, and facilitate the creation of
user-defined reports: tables similar to the ones released now, with any
combination of columns and rows the user selects. The program will also be
able to output coordinate files and calculated quantities on request.


STAFF OF THE NUCLEIC ACID DATABASE PROJECT:
------------------------------------------
   Helen M. Berman, Wilma K. Olson, David Beveridge--Principal Investigators
   John Westbrook--Director, Center for Computational Chemistry,
                   Rutgers University
   Anke Gelbin--Database Coordinator
   Tamas Demeny--Database Programmer
   Donna Horton--Staff Assistant


ACCESS INFORMATION:
------------------
The released information is located in the NUCLEIC ACID DATABASE LIBRARY
(NDBLIB). You can communicate with this library in two ways:

   1. ELECTRONIC MAIL (MAIL SERVER)
   2. FILE TRANSFER PROGRAM (ANONYMOUS FTP)

1. MAIL SERVER INSTRUCTIONS:

The electronic mail addresses of NDBLIB are:

   Internet:     ndblib@helix.rutgers.edu
                 ndblib@rutchm.rutgers.edu
   EARN/BITNET:  ndblib@rutchm

To obtain general information about NDBLIB, send the following mail
message:

   To: ndblib@helix.rutgers.edu
   Subject: send index

The NDBLIB is currently divided into three libraries:

A. newsletter:

This library contains this newsletter, and all future newsletters, as
computer files.
To get a list of files (index) in this library, send the following mail
message:

   To: ndblib@helix.rutgers.edu
   Subject: send index from newsletter

To get a copy of a newsletter itself, send this message, substituting for
<filename> the name of the file containing the desired newsletter, as
described in the index:

   To: ndblib@helix.rutgers.edu
   Subject: send <filename> from newsletter

FOR EXAMPLE: The index you receive from the library called newsletter will
contain this information about the first newsletter:

   File Name:     Description:
   -----------------------------------------------------------------
   news.dec91     Introductory newsletter, released in December 1991.

Therefore, to get a copy of the newsletter, you would send the following
message:

   To: ndblib@helix.rutgers.edu
   Subject: send news.dec91 from newsletter

All files in this library are ASCII text files, meaning that they can be
listed and printed directly.

B. reports_ascii:

This library contains the ASCII text versions of the tables currently
available. You can find a description of the contents of the tables at the
end of this newsletter. All files in this directory are ASCII files, so
they can be viewed and printed directly.

To get a list of the released tables, their descriptions, and the names of
the files containing them, send the following mail message:

   To: ndblib@helix.rutgers.edu
   Subject: send index from reports_ascii

To get an ASCII table itself, send this message, substituting for
<filename> the name of the file containing the desired table, as described
in the index:

   To: ndblib@helix.rutgers.edu
   Subject: send <filename> from reports_ascii

NOTE: These ASCII files are 130 characters wide, so the lines will wrap if
you view them on an 80-character-wide screen. For a better view of the
tables, use a screen that is at least 130 characters wide. When printing
hard copies, take care to print in landscape format (the longer side of
the paper at the top and bottom).

C. reports_ps:

This library contains the PostScript versions of the tables currently
available. You can find a description of the contents of the tables at the
end of this newsletter. All files containing tables in this directory are
PostScript files. To view these files, you must print them on a printer
that can handle PostScript files.

To get a list of the released tables, their descriptions, and the names of
the files containing them, send the following mail message:

   To: ndblib@helix.rutgers.edu
   Subject: send index from reports_ps

To get a PostScript table itself, send this message, substituting for
<filename> the name of the file containing the desired table, as described
in the index:

   To: ndblib@helix.rutgers.edu
   Subject: send <filename> from reports_ps

Problems or questions concerning the NDBLIB server should be addressed to:

   ndbadmin@helix.rutgers.edu

2. ANONYMOUS FTP:

The ftp program allows you to transfer files directly from the library
into your own directory. Following are some basic instructions on how to
access the library via ftp. For a complete description of how to use ftp
in general, refer to the manual of your particular computer system.

Start ftp by typing:

   ftp helix.rutgers.edu

Log in as:

   anonymous

For the password:

   USE YOUR OWN USERNAME

To access NDBLIB, you first have to enter its main directory, called
'pub'. Do this by typing the following command:

   cd pub

Now you can choose one of the three libraries (subdirectories) described
in the MAIL SERVER INSTRUCTIONS section by typing one of:

   cd newsletter
   cd reports_ascii
   cd reports_ps

To obtain a listing of the files in your current library (subdirectory) on
the screen, you can type:

   ls

To obtain a complete description of the files in this library
(subdirectory), type:

   get index

This will create a file in your directory, called index.
To transfer any of the files to your directory, type:

   get <filename>

where <filename> is the name of the file, as it appears in the listing and
in the index.

To exit the ftp program, type:

   quit


THE CURRENT RELEASE:
-------------------
This release consists of the following eight tables, available as both
PostScript and text files in the reports_ps and reports_ascii libraries,
respectively.

TABLE 1: names

This table describes the naming conventions for all structures. It
includes the following information:

   -File name used by Nucleic Acid Database (NDB),
   -File name used by Protein Data Bank (PDB),
   -File name used by Cambridge Structural Database (CSD),
   -Compound name,
   -Coordinate status of structure in NDB
    (Y = available, P = in preparation, * = not available).

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   names_a.ps       names_a.ascii      -DNA's and RNA's with structure
                                        type A
   names_b.ps       names_b.ascii      -DNA's with structure type B
                                        (with no drugs bound)
   names_d.ps       names_d.ascii      -DNA-DRUG complexes and RNA-DRUG
                                        complexes
   names_g.ps       names_g.ascii      -DNA-GROOVE-BINDER complexes
   names_u.ps       names_u.ascii      -DNA's and RNA's with unusual
                                        structure types
   names_z.ps       names_z.ascii      -DNA's (and DNA/RNA hybrids) with
                                        structure type Z
   names.ps         names.ascii        -This file contains all six of the
                                        above tables in one.

IMPORTANT NOTE: This table should be used in conjunction with all of the
other tables. It is your key to identifying the structures in the
database, according to any convenient naming convention.

TABLE 2: citation

This table lists the complete citation of the primary references for all
structures.
The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   citation_a.ps    citation_a.ascii   -DNA's and RNA's with structure
                                        type A
   citation_b.ps    citation_b.ascii   -DNA's with structure type B
                                        (with no drugs bound)
   citation_d.ps    citation_d.ascii   -DNA-DRUG complexes and RNA-DRUG
                                        complexes
   citation_g.ps    citation_g.ascii   -DNA-GROOVE-BINDER complexes
   citation_u.ps    citation_u.ascii   -DNA's and RNA's with unusual
                                        structure types
   citation_z.ps    citation_z.ascii   -DNA's (and DNA/RNA hybrids) with
                                        structure type Z
   citation.ps      citation.ascii     -This file contains all six of the
                                        above tables in one.

TABLE 3: struct

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Structure type (A, B, Z, RH = right handed, U = unusual),
   -Base modification (yes/no),
   -Phosphate modification (yes/no),
   -Mismatch (yes/no),
   -Drug name (for structures with drugs).

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   struct.ps        struct.ascii       -All structures.

TABLE 4: drugs

This table includes the following information:

   -NDB file name of structure,
   -Sequence of the chain,
   -Chemical name of drug,
   -Binding type.

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   drugs.ps         drugs.ascii        -Structures with drugs.
                                        Listed in this order: binding type
                                        INTERCALATION, binding type MINOR
                                        GROOVE BINDER, other binding types.

TABLE 5: mismatch

This table includes the following information:

   -NDB file name of structure,
   -Compound name,
   -Mismatched residue of the chain 'A',
   -Mismatched residue of the chain 'B'.
The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   mismatch.ps      mismatch.ascii     -Structures with mismatches.

TABLE 6: basemod

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Name of base modifier,
   -Name of atom to which base modifier is connected,
   -Residue of the atom to which base modifier is connected,
   -Name of chain ('A' or 'B').

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   basemod.ps       basemod.ascii      -Structures with base modifications.

TABLE 7: phosmod

This table includes the following information:

   -NDB file name of structure,
   -Sequence of chain,
   -Name of phosphate modifier,
   -Name of atom replaced by this modifier,
   -Residue of the atom replaced,
   -Name of chain ('A' or 'B').

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   phosmod.ps       phosmod.ascii      -Structures with phosphate
                                        modifications.

TABLE 8: celldim

This table includes the following information:

   -NDB file name of structure,
   -Compound name (not available in text file),
   -Sequence of the chain (not available in text file),
   -Cell dimensions a, b, c,
   -Cell angles alpha, beta, gamma,
   -Space group,
   -Coordinate status in NDB
    (Y = available, P = in preparation, * = not available).

The table is located in the following files in the library:

   File name in     File name in
   reports_ps:      reports_ascii:     Contents:
   ---------------- ------------------ -------------------------------------
   celldim.ps       celldim.ascii      -All structures.
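

ILLUSTRATION: COMPOSING REQUESTS

As an illustration only (this is not part of the NDB software), the mail
server requests and table file names described in this newsletter can be
sketched in Python. The server address, the three library names, the
"send ... from ..." subject forms, and the file-name patterns are taken
from the text of this newsletter. The function names build_request and
table_filename, the sender address, and the empty message body are
assumptions made for the example. The sketch only constructs messages and
names; it does not send mail or contact any server.

```python
from email.message import EmailMessage

# Values taken from the newsletter text.
NDBLIB_ADDRESS = "ndblib@helix.rutgers.edu"
LIBRARIES = ("newsletter", "reports_ascii", "reports_ps")
SPLIT_TABLES = ("names", "citation")          # released per structure type
SUBSETS = ("a", "b", "d", "g", "u", "z")      # structure-type suffixes
TABLES = SPLIT_TABLES + (
    "struct", "drugs", "mismatch", "basemod", "phosmod", "celldim",
)


def build_request(library=None, filename=None, sender="user@example.org"):
    """Return an EmailMessage whose Subject is an NDBLIB request.

    No arguments          -> "send index"
    library only          -> "send index from <library>"
    library and filename  -> "send <filename> from <library>"
    The sender address is a hypothetical placeholder.
    """
    if library is None:
        if filename is not None:
            raise ValueError("a file name requires a library")
        subject = "send index"
    else:
        if library not in LIBRARIES:
            raise ValueError(f"unknown library: {library!r}")
        subject = f"send {filename or 'index'} from {library}"

    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = NDBLIB_ADDRESS
    msg["Subject"] = subject
    # The newsletter specifies only To and Subject; the empty body is an
    # assumption of this sketch.
    msg.set_content("")
    return msg


def table_filename(table, subset=None, postscript=False):
    """Return a table file name such as names_a.ascii or celldim.ps."""
    if table not in TABLES:
        raise ValueError(f"unknown table: {table!r}")
    if subset is None:
        stem = table
    else:
        # Only the names and citation tables are split by structure type.
        if table not in SPLIT_TABLES or subset not in SUBSETS:
            raise ValueError(f"no subset {subset!r} for table {table!r}")
        stem = f"{table}_{subset}"
    return stem + (".ps" if postscript else ".ascii")
```

For example, build_request("reports_ascii", table_filename("names", "a"))
yields a message addressed to ndblib@helix.rutgers.edu with the subject
"send names_a.ascii from reports_ascii".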