The basic element of information in an mmCIF data file is an individual data item such as a structure factor. Collections of related data items may be grouped together in subcategories. For instance, in the mmCIF dictionary the x, y, and z Cartesian components are assigned to the cartesian_coordinate subcategory.
Data items may also be organized into categories. A category is a logical association among a group of data items requiring that the value(s) of one or more of the items in the group can be used to distinguish between different instances of the group. A category has many of the properties of a table in a relational database. For example, the mmCIF category atom_site, which holds the table of atomic positions, uses the data item atom_id as the unique identifier or key for each atomic position.
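The table analogy can be made concrete with a small sketch. It assumes a category has already been read into a list of rows, one dictionary per instance, and indexes those rows by the key item named above. The helper index_by_key, the item names other than atom_id, and all values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def index_by_key(rows: List[Row], key_items: Tuple[str, ...]) -> Dict[Tuple[str, ...], Row]:
    """Index the rows of a category by the values of its key items."""
    index: Dict[Tuple[str, ...], Row] = {}
    for row in rows:
        key = tuple(row[item] for item in key_items)
        if key in index:
            # Two instances with the same key values cannot be told apart.
            raise ValueError(f"duplicate key {key} in category")
        index[key] = row
    return index


# Illustrative rows only: the item names other than atom_id and all values
# are assumptions made for this sketch, not taken from a real entry.
atom_site_rows: List[Row] = [
    {"atom_id": "1", "type_symbol": "N", "x": "25.369", "y": "30.691", "z": "11.795"},
    {"atom_id": "2", "type_symbol": "C", "x": "25.970", "y": "31.965", "z": "12.332"},
]

atom_site_index = index_by_key(atom_site_rows, ("atom_id",))
```

Passing the key as a tuple lets the same helper cover categories whose instances are distinguished by more than one item.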
In organizing the data definitions in the mmCIF dictionary into categories, it was found that many different categories shared common sets of unique identifiers. For instance, the definition of protein secondary structure includes the residue labels that define the limits of each structural feature. These residue labels also occur in the definition of the molecular sequence and in the definition of each atomic position. In some cases, it is important that a secondary structure description references only those residue labels for which positions have been determined. The fact that the residue labels, although used in these different contexts, refer to the same item of information is defined within the mmCIF dictionary by specifying parent/child relationships between the residue label items in the different categories. Because parent/child relationships are prevalent at all levels of macromolecular structure description, it is an important feature of the mmCIF data description that the relationships between common data items in different categories are precisely described.
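A parent/child relationship behaves much like a foreign-key constraint, and one way to picture the check it enables is the sketch below. It assumes two categories held as lists of row dictionaries and reports child values that have no counterpart in the parent item. The function find_orphans, the item names (seq_id, beg_seq_id, end_seq_id, helix_id), and the data are hypothetical; they are not the names used in the mmCIF dictionary.

```python
from typing import Dict, Iterable, List

Row = Dict[str, str]


def find_orphans(parent_rows: Iterable[Row], parent_item: str,
                 child_rows: Iterable[Row], child_item: str) -> List[str]:
    """Return child values that have no matching value in the parent item."""
    parent_values = {row[parent_item] for row in parent_rows}
    return [row[child_item] for row in child_rows
            if row[child_item] not in parent_values]


# Hypothetical parent category: residues for which positions were determined.
residues = [{"seq_id": "1"}, {"seq_id": "2"}, {"seq_id": "3"}]

# Hypothetical child category: secondary structure features whose limits
# are given as residue labels.
helices = [
    {"helix_id": "H1", "beg_seq_id": "1", "end_seq_id": "3"},
    {"helix_id": "H2", "beg_seq_id": "2", "end_seq_id": "7"},
]

# Check both limits of every feature against the parent item.
orphans = (find_orphans(residues, "seq_id", helices, "beg_seq_id")
           + find_orphans(residues, "seq_id", helices, "end_seq_id"))
```

Here the second helix ends at a residue label that is absent from the parent category, so that label is the only value collected in orphans.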
Collections of related categories in the mmCIF dictionary are organized into category groups. Category groups provide a mechanism for expressing associations among categories in the same manner as chapters organize related sections in books. For example, in the mmCIF dictionary all of the categories pertaining to refinement are assigned to a category group named refine_group.
The highest level of organization provided by mmCIF is the data block. Each data block acts like an independent database. A data block is used to contain the mmCIF dictionary, and it may be used to hold all of the information pertaining to a particular structural experiment.
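In CIF syntax a data block begins with a header of the form data_ followed by the block name. The sketch below splits a file's text into blocks on that header to show how each block can be handled independently. It is deliberately simplified and is not a full parser: it assumes no multi-line text field contains a line starting with data_, and the function split_data_blocks, the block names, and the values are hypothetical.

```python
from typing import Dict, List, Optional


def split_data_blocks(text: str) -> Dict[str, List[str]]:
    """Group the non-empty lines of a CIF-style text by data block name."""
    blocks: Dict[str, List[str]] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.lower().startswith("data_"):
            # A data_ header opens a new block named by the rest of the token.
            current = stripped[len("data_"):]
            blocks[current] = []
        elif current is not None and stripped:
            blocks[current].append(stripped)
    return blocks


# Illustrative input with two independent blocks; the values are made up.
example = """\
data_block_one
_cell.length_a 10.0
data_block_two
_cell.length_a 20.0
"""

blocks = split_data_blocks(example)
```

Each entry of blocks can then be processed without reference to the others, which mirrors the description of a data block as an independent database.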