source: branches/4.1/components/bio-formats/doc/using-bioformats.txt @ 5727

Revision 5727, 7.8 KB checked in by melissa, 11 years ago

Updated documentation and what's new for today's release.

                   Using Bio-Formats Guide by Melissa Linkert

                                  Overview
                                 ----------

This document describes various things that are useful to know when working
with Bio-Formats.  It is recommended that you obtain the Bio-Formats source
by following the directions at http://www.loci.wisc.edu/software, rather than
using an official release.  It is also recommended that you have a copy of the
JavaDocs nearby (available online at
http://hudson.openmicroscopy.org.uk/job/LOCI/javadoc/);
the notes that follow will make more sense when you see the API.

For a complete list of supported formats, see the Bio-Formats home page:
http://www.loci.wisc.edu/ome/formats.html

                              Basic File Reading
                             --------------------

Bio-Formats provides several methods for retrieving data from files in an
arbitrary (supported) format.  These methods fall into three categories: raw
pixels, core metadata, and format-specific metadata.  All methods described here
are present and documented in loci.formats.IFormatReader - it is advised that
you take a look at the source and/or JavaDoc.  In general, it is recommended
that you read files using an instance of ImageReader.  While it is possible to
work with readers for a specific format, ImageReader contains additional logic
to automatically detect the format of a file and delegate subsequent calls to
the appropriate reader.

Prior to retrieving pixels or metadata, it is necessary to call setId(String)
on the reader instance, passing in the name of the file to read.  Some formats
allow multiple series (5D image stacks) per file; in this case you may wish to
call setSeries(int) to change which series is being read.

Raw pixels are always retrieved one plane at a time.  Planes are returned
as raw byte arrays, using one of the openBytes methods.
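
Because a plane arrives as plain bytes, the caller has to reassemble the
samples using the pixel type and byte order that the reader reports.  The
sketch below shows one way to do that with the JDK alone, assuming a
single-channel plane of unsigned 16-bit samples.  PlaneDecoder and toUint16
are illustrative names, not part of Bio-Formats.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustrative helper (not a Bio-Formats class). */
final class PlaneDecoder {
  /** Unpacks a raw plane of unsigned 16-bit samples into ints. */
  static int[] toUint16(byte[] plane, boolean littleEndian) {
    if (plane.length % 2 != 0) {
      throw new IllegalArgumentException("odd byte count: " + plane.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(plane)
      .order(littleEndian ? ByteOrder.LITTLE_ENDIAN : ByteOrder.BIG_ENDIAN);
    int[] samples = new int[plane.length / 2];
    for (int i = 0; i < samples.length; i++) {
      samples[i] = buf.getShort() & 0xffff;  // mask undoes sign extension
    }
    return samples;
  }
}
```

The littleEndian argument would typically come from the reader's
isLittleEndian() method; other pixel types need a different sample width.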

Core metadata is the general term for anything that might be needed to work with
the planes in a file.  A list of core metadata fields is given below, with the
appropriate accessor method in parentheses:

- image width (getSizeX())
- image height (getSizeY())
- number of series per file (getSeriesCount())
- total number of images per series (getImageCount())
- number of slices in the current series (getSizeZ())
- number of timepoints in the current series (getSizeT())
- number of actual channels in the current series (getSizeC())
- number of channels per image (getRGBChannelCount())
- the ordering of the images within the current series (getDimensionOrder())
- whether each image is RGB (isRGB())
- whether the pixel bytes are in little-endian order (isLittleEndian())
- whether the channels in an image are interleaved (isInterleaved())
- the type of pixel data in this file (getPixelType())

All file formats are guaranteed to accurately report core metadata.
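
The dimension order and the Z, C and T sizes together determine which plane
number holds a given (z, c, t) position.  The sketch below works through that
arithmetic.  It assumes an order string of the form "XY" followed by the
letters Z, C and T, with the first-listed axis varying fastest, and one
channel per plane (not RGB).  PlaneIndex is an illustrative name; check the
JavaDoc for the helpers that Bio-Formats itself provides.

```java
/** Illustrative helper (not a Bio-Formats class). */
final class PlaneIndex {
  /**
   * Maps (z, c, t) to a plane number for an order such as "XYCZT",
   * where the axis listed first after "XY" varies fastest.
   */
  static int of(String order, int sizeZ, int sizeC, int sizeT,
                int z, int c, int t) {
    if (order.length() != 5 || !order.startsWith("XY")) {
      throw new IllegalArgumentException("bad order: " + order);
    }
    int index = 0;
    int stride = 1;  // number of planes spanned by one step along the axis
    for (char axis : order.substring(2).toCharArray()) {
      switch (axis) {
        case 'Z': index += z * stride; stride *= sizeZ; break;
        case 'C': index += c * stride; stride *= sizeC; break;
        case 'T': index += t * stride; stride *= sizeT; break;
        default: throw new IllegalArgumentException("bad axis: " + axis);
      }
    }
    return index;
  }
}
```

With 3 slices, 2 channels and 4 timepoints, the position z=1, c=1, t=2 is
plane 15 under "XYCZT" (1 + 1*2 + 2*6) and plane 16 under "XYZCT"
(1 + 1*3 + 2*6).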

Format-specific metadata refers to any other data specified in the file - this
includes acquisition and hardware parameters, among other things.  This data
is stored internally in a java.util.Hashtable, and can be accessed in one of
two ways: individual values can be retrieved by calling
getMetadataValue(String), which gets the value of the specified key.
Alternatively, getMetadata() will return the entire Hashtable.
Note that the keys in this Hashtable are different for each format, hence the
name "format-specific metadata".
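
Since the keys vary by format, a common first step is simply to list whatever
the table contains.  The sketch below does that for any Hashtable, sorted by
key.  It assumes String keys, and MetadataDump is an illustrative name rather
than a Bio-Formats class.

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative helper (not a Bio-Formats class). */
final class MetadataDump {
  /** Renders a metadata table as "key = value" lines, sorted by key. */
  static String render(Hashtable<String, Object> metadata) {
    StringBuilder out = new StringBuilder();
    // Hashtable iteration order is unspecified, so sort through a TreeMap.
    for (Map.Entry<String, Object> e : new TreeMap<>(metadata).entrySet()) {
      out.append(e.getKey()).append(" = ").append(e.getValue()).append('\n');
    }
    return out.toString();
  }
}
```

In practice the table passed in would be the one returned by getMetadata().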

See the Bio-Formats Metadata Guide for more information on the metadata
capabilities that Bio-Formats provides.

                             File Reading Extras
                            ---------------------

The previous section described how to read pixels as they are stored in the
file.  However, the native format isn't necessarily convenient, so Bio-Formats
provides a few extras to make file reading more flexible.

- There are a few "wrapper" readers (that implement IFormatReader) that take a
  reader in the constructor, and manipulate the results somehow, for
  convenience.  Using them is similar to the java.io InputStream/OutputStream
  model: just layer whichever functionality you need by nesting the wrappers.
  + BufferedImageReader implements IFormatReader, and allows pixel data to be
    returned as BufferedImages instead of raw byte arrays.
  + FileStitcher implements IFormatReader, and uses advanced pattern
    matching heuristics to group files that belong to the same dataset.
  + ChannelSeparator implements IFormatReader, and makes sure that
    all planes are grayscale - RGB images are split into 3 separate grayscale
    images.
  + ChannelMerger implements IFormatReader, and merges grayscale
    images to RGB if the number of channels is greater than 1.
  + ChannelFiller implements IFormatReader, and converts indexed color
    images to RGB images.
  + MinMaxCalculator implements IFormatReader, and provides an API
    for retrieving the minimum and maximum pixel values for each channel.
  + DimensionSwapper implements IFormatReader, and provides an API
    for changing the dimension order of a file.
- ImageTools and loci.formats.gui.AWTImageTools provide a number of methods for
  manipulating BufferedImages and primitive type arrays.  In particular, there
  are methods to split and merge channels in a BufferedImage/array, as well as
  to convert data to a specific type (e.g. convert short data to byte data).
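
The wrapper idea can be illustrated without Bio-Formats at all.  The sketch
below defines a tiny stand-in interface and two wrappers loosely modelled on
ChannelSeparator and MinMaxCalculator, then nests them the way java.io streams
are nested.  PlaneSource, RgbSource, SeparatingSource and MinMaxSource are
hypothetical names, and the sketch assumes interleaved 8-bit RGB data.

```java
/** Hypothetical, much-reduced stand-in for a reader interface. */
interface PlaneSource {
  int getImageCount();
  byte[] openBytes(int no);
}

/** Stand-in for a format reader: one interleaved RGB plane (RGBRGB...). */
final class RgbSource implements PlaneSource {
  private final byte[] rgb;
  RgbSource(byte[] rgb) { this.rgb = rgb; }
  public int getImageCount() { return 1; }
  public byte[] openBytes(int no) { return rgb.clone(); }
}

/** Wrapper: presents each RGB plane as three grayscale planes. */
final class SeparatingSource implements PlaneSource {
  private final PlaneSource inner;
  SeparatingSource(PlaneSource inner) { this.inner = inner; }
  public int getImageCount() { return inner.getImageCount() * 3; }
  public byte[] openBytes(int no) {
    byte[] rgb = inner.openBytes(no / 3);
    int channel = no % 3;
    byte[] gray = new byte[rgb.length / 3];
    for (int i = 0; i < gray.length; i++) gray[i] = rgb[3 * i + channel];
    return gray;
  }
}

/** Wrapper: passes planes through and tracks min/max of what was read. */
final class MinMaxSource implements PlaneSource {
  private final PlaneSource inner;
  private int min = Integer.MAX_VALUE;
  private int max = Integer.MIN_VALUE;
  MinMaxSource(PlaneSource inner) { this.inner = inner; }
  public int getImageCount() { return inner.getImageCount(); }
  public byte[] openBytes(int no) {
    byte[] plane = inner.openBytes(no);
    for (byte b : plane) {
      int v = b & 0xff;  // treat samples as unsigned
      min = Math.min(min, v);
      max = Math.max(max, v);
    }
    return plane;
  }
  int getMin() { return min; }
  int getMax() { return max; }
}
```

Nesting then reads as new MinMaxSource(new SeparatingSource(source)); each
layer only sees the PlaneSource beneath it, so layers can be combined in
whatever order is needed.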

                                Writing Files
                               ---------------

The following file formats can be written using Bio-Formats:

- TIFF (uncompressed or LZW)
- OME-TIFF (uncompressed or LZW)
- JPEG
- PNG
- AVI (uncompressed)
- QuickTime (uncompressed is supported natively; additional codecs use QTJava)
- Encapsulated PostScript (EPS)

We are planning support for OME-XML in the near future.

The writer API (see loci.formats.IFormatWriter) is very similar to the reader
API, in that files are written one plane at a time (rather than all at once).

All writers allow the output file to be changed before the last plane has
been written.  This allows you to write to any number of output files using
the same writer and output settings (compression, frames per second, etc.),
and is especially useful for formats that do not support multiple images per
file.
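
The pattern of switching the output between planes can be sketched as follows.
This is not the IFormatWriter API: SketchWriter, setTarget and writePlane are
hypothetical names, and the "files" are in-memory buffers so that the example
runs with the JDK alone.

```java
import java.io.ByteArrayOutputStream;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in for a writer; outputs live in memory. */
final class SketchWriter {
  private final Map<String, ByteArrayOutputStream> outputs =
    new LinkedHashMap<>();
  private String current;

  /** Selects the output that subsequent planes are appended to. */
  void setTarget(String name) {
    current = name;
    outputs.computeIfAbsent(name, k -> new ByteArrayOutputStream());
  }

  /** Appends one plane to the currently selected output. */
  void writePlane(byte[] plane) {
    if (current == null) throw new IllegalStateException("no target set");
    outputs.get(current).write(plane, 0, plane.length);
  }

  Map<String, ByteArrayOutputStream> outputs() { return outputs; }

  /** One plane per output, reusing the same writer instance throughout. */
  static SketchWriter onePlanePerTarget(byte[][] planes, String prefix) {
    SketchWriter writer = new SketchWriter();
    for (int no = 0; no < planes.length; no++) {
      writer.setTarget(prefix + no + ".raw");  // switch output, keep settings
      writer.writePlane(planes[no]);
    }
    return writer;
  }
}
```

For a real example that uses the Bio-Formats writers, the Movie Stitcher
source remains the place to look.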

Please see the Movie Stitcher (loci.apps.stitcher) for an example of how
to write files using Bio-Formats.

                    Arcane Notes and Implementation Details
                   -----------------------------------------

Following is a list of known oddities.

o Importing multi-file formats (Leica LEI, PerkinElmer, FV1000 OIF, ICS, and
  Prairie TIFF) can fail if any of the files are renamed.  There are
  "best guess" heuristics in these readers, but they aren't guaranteed to work
  in general.  So please don't rename files in these formats.

o If you are working on a Macintosh, make sure that the data and resource forks
  of your image files are stored together.  Bio-Formats does not handle
  separated forks (the native QuickTime reader tries, but usually fails).

o Through specialized I/O classes, Bio-Formats is able to control the number of
  open file descriptors (in the current JVM).  Currently, the maximum is 200,
  which is lower than the default on most systems.  Side note on I/O: the
  reasoning behind writing our own I/O stuff (see
  loci.common.RandomAccessInputStream) is 1) InputStreams are fast at reading
  data sequentially, but cannot do random access; 2) RandomAccessFiles are
  great for random access, but less efficient for sequential reading; 3) we
  needed RandomAccessFile-like functionality for byte arrays; 4) we wanted to
  be able to read from disk, over HTTP, and potentially other sources.  The
  result is a hybrid class that extends InputStream and implements DataInput to
  meet all of our goals.
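
  To make goal 3 concrete, here is a much-reduced sketch of the idea: an
  InputStream over a byte array that can also seek.  It is not
  RandomAccessInputStream and does not implement DataInput in full;
  SeekableByteArrayStream is an illustrative name, with seek and
  getFilePointer borrowed from java.io.RandomAccessFile.

```java
import java.io.InputStream;

/** Illustrative sketch: sequential reads plus random access on a byte[]. */
final class SeekableByteArrayStream extends InputStream {
  private final byte[] data;
  private int pos;

  SeekableByteArrayStream(byte[] data) { this.data = data; }

  /** Sequential read, as with any InputStream; -1 at end of data. */
  @Override public int read() {
    return pos < data.length ? data[pos++] & 0xff : -1;
  }

  /** Random access: jump to an absolute offset. */
  void seek(long offset) {
    if (offset < 0 || offset > data.length) {
      throw new IllegalArgumentException("offset out of range: " + offset);
    }
    pos = (int) offset;
  }

  long getFilePointer() { return pos; }

  /** Reads a big-endian 32-bit int, in the manner of DataInput.readInt. */
  int readInt() {
    if (data.length - pos < 4) {
      throw new IllegalStateException("fewer than 4 bytes left");
    }
    return (read() << 24) | (read() << 16) | (read() << 8) | read();
  }
}
```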

o RLE-compressed QuickTime movies will look funny if the planes are not read
  in sequential order, since proper decoding of a particular plane can depend
  on the previous plane.
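
  The dependency can be seen in a toy model where each frame may say "keep
  the previous frame's pixel".  This is only an illustration of why order
  matters, not the QuickTime RLE bitstream; DeltaFrames and the SKIP marker
  are made up for the example.

```java
/** Toy model of frames that depend on the previous frame. */
final class DeltaFrames {
  static final int SKIP = -1;  // "keep the previous frame's pixel"

  /** Decodes frame n by applying frames 0..n in order. */
  static int[] decodeSequential(int[][] encoded, int n) {
    int[] frame = new int[encoded[0].length];
    for (int f = 0; f <= n; f++) apply(encoded[f], frame);
    return frame;
  }

  /** Decodes frame n in isolation; skipped pixels stay at 0. */
  static int[] decodeAlone(int[][] encoded, int n) {
    int[] frame = new int[encoded[n].length];
    apply(encoded[n], frame);
    return frame;
  }

  private static void apply(int[] codes, int[] frame) {
    for (int i = 0; i < codes.length; i++) {
      if (codes[i] != SKIP) frame[i] = codes[i];
    }
  }
}
```

  With frames {5, 6, 7} and {SKIP, 9, SKIP}, reading in order gives
  {5, 9, 7} for the second frame, while decoding it alone gives {0, 9, 0}.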