How DART stores observations: observation sequence (obs_seq) files
==================================================================

Since DART is designed to assimilate observations from any data source, it
includes a set of programs to convert observations from their original
format to DART's own observation sequence, or ``obs_seq``, format. The
``obs_seq`` format is designed to allow DART to accommodate a myriad of source
observation file formats, structures, and metadata. Many original source
observation files don't contain the necessary information about the error
characteristics and spatial structure of the data needed to perform an
assimilation.

There are three types of ``obs_seq`` files.

obs_seq.in
----------

An ``obs_seq.in`` file actually contains no observation quantities. It may be
best thought of as a perfectly laid-out notebook waiting for an observer
to fill in the actual observation quantities.

All the rows and columns are ready, labelled, and repeated for every
observation time and platform. The ``obs_seq.in`` file is generally the start
of a "perfect model" experiment.

In a perfect model experiment, one instance of the model is run through the
DART program ``perfect_model_obs``, which applies the appropriate forward
operators to the model state and writes the observations generated by the
model into the perfectly laid-out notebook.

The completed notebook is then renamed ``obs_seq.out``.

obs_seq.out
-----------

An ``obs_seq.out`` file contains a linked list of observations. The
observations can potentially be (and usually are) from different platforms and
of different quantities, each with their own error characteristics and
metadata.

An ``obs_seq.out`` file containing real data can be generated by using one of
DART's many observation converter programs. Additionally, an ``obs_seq.out``
file containing synthetic data can be created by running DART's
``perfect_model_obs`` program.

The observations in the ``obs_seq.out`` files are assimilated into the model
ensemble by DART's ``filter`` program.

To learn more about the structure of the ``obs_seq.out`` file, see
:doc:`detailed-structure-obs-seq`.

If you want to create an observation sequence file from real observations, you
should contact DAReS staff by emailing dart@ucar.edu for advice regarding your
specific types of observations.

obs_seq.final
-------------

When running an assimilation, DART's ``filter`` program assimilates the
observations contained in the ``obs_seq.out`` file and generates an 
``obs_seq.final`` file.

The ``obs_seq.final`` file contains everything in the ``obs_seq.out`` file and
also contains a few additional 'copies' of the observation.

Since DART is an ensemble algorithm, each ensemble member must compute its own
estimate of the observation for the algorithm. You can save the ensemble
members' estimates of the observation in the ``obs_seq.final`` file by setting
the ``num_output_obs_members`` entry in the ``filter_nml`` namelist of
``input.nml`` to a value greater than zero.
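
As a minimal sketch, the relevant entry in ``input.nml`` might look like the
fragment below. The value ``3`` is only an illustration, and a real
``filter_nml`` contains many other entries that are omitted here.

.. code-block:: fortran

   &filter_nml
      ! illustrative value: save the estimates from 3 ensemble members
      num_output_obs_members = 3
   /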

At a minimum, ``filter`` records the mean and spread of the ensemble estimates
in the ``obs_seq.final`` file.

To learn more about the structure of the ``obs_seq.final`` file, see
:doc:`detailed-structure-obs-seq`.

Using obs_seq.final for observation-space diagnostics
-----------------------------------------------------

The best method to determine the performance of an experiment in which you
assimilate data from real-world sources is to compare the ensemble estimates of
the observation to your real-world data. You can estimate the bias and error of
the ensemble mean or gauge how many of the real-world observations are actually
being assimilated. These diagnostics are known as observation-space
diagnostics.
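
As an illustration, one common way to define the bias and root-mean-square
error (RMSE) of the ensemble mean over :math:`N` observations is shown below.
The symbols are introduced here for illustration only: :math:`\bar{y}_i` is the
ensemble mean estimate of the :math:`i`-th observation and :math:`y_i^{o}` is
the observed value. The sign convention for the bias is an assumption, so check
the documentation of the diagnostic program you use for its exact definitions.

.. math::

   \mathrm{bias} = \frac{1}{N}\sum_{i=1}^{N}\left(\bar{y}_i - y_i^{o}\right),
   \qquad
   \mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\bar{y}_i - y_i^{o}\right)^{2}}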

DART provides :doc:`programs obs_diag <../assimilation_code/programs/readme>`
and :doc:`matlab-observation-space` for you to use to quickly assess the
performance of your experiment.

.. note::

   Since each 'observation type' may require different amounts of metadata to
   be read or written, any routine to read or write an observation sequence
   **must** be compiled with support for those particular observations. The
   supported observations are listed in the ``obs_kind_nml`` namelist of
   ``input.nml``. For more information, see :doc:`preprocess-program`.
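
As a hedged sketch, an ``obs_kind_nml`` entry in ``input.nml`` might look like
the fragment below. The observation type names are illustrative choices, not a
recommendation. Use the types that match your own observations and that your
executables were built to support.

.. code-block:: fortran

   &obs_kind_nml
      ! illustrative observation types; substitute your own
      assimilate_these_obs_types = 'RADIOSONDE_TEMPERATURE',
                                   'RADIOSONDE_U_WIND_COMPONENT'
      evaluate_these_obs_types   = 'RADIOSONDE_SURFACE_PRESSURE'
   /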