OME-Remote Objects
==================
.. figure:: /images/developer-model-pdi.png
:align: left
:width: 60%
:alt:
OMERO is based on the OME data model which can appear overly complex for
new users. However, the core entities you need for getting started are
much simpler.
Images in OMERO are organized into a many-to-many container hierarchy:
"Project" -> "Dataset" -> "Image". These containers (and various other
objects) can be annotated to link various types of data. Annotation
types include Comment (string), Tag (short string), Boolean, Long, Xml,
File attachment etc.
Images are represented as Pixels with 5 dimensions: X, Y, Z, Channel,
Time.
At the core of the work on the `Open Microscopy Environment`_ is the
definition of a vocabulary for working with microscopic data. This
vocabulary has a representation in the :model_doc:`XML specification `,
in the database (the data model), and in code. This last representation
is the object model with which we will concern ourselves here.
Because of its complexity, the object model is generated from a
:model_sourcedir:`central definition <>` using our own
:dsl_plugin_sourcedir:`code-generator <>`. It relies on no libraries
and can be used in both the server and the RMI clients. The
relationships among the objects are enumerated in a cross-referenced
:doc:`reference document `.
:doc:`/developers/server-blitz` uses a :blitz_sourcedir:`second mapping
` to generate |OmeroJava|,
|OmeroPy|, and |OmeroCpp| classes, which can be :blitz_source:`mapped
` back and forth to the
server object model. *This document discusses only the server
object-model and how it is used internally.*
Instances of the object model have no direct interaction with the
database, rather the mapping is handled externally by the O/R framework,
Hibernate_. That means, by and large,
generated classes are data objects, composed only of getter and setter
fields for fields representing columns in the database, and contain no
business logic. However, to make working with the model easier, and
perhaps more powerful, there are several features which we have built in.
.. note::
The discussion here of object types is still relevant but uses
the ome.model.\* objects for examples. These are server internal types
which may lead to some confusion. Clients work with omero.model.\*
objects. This documentation will eventually be updated to reflect both
hierarchies.
OMERO type language
-------------------
The Model has two general parts:
first, the long-studied and well-established core model and second, the
user-specified portion. It is vital that there is a central definition
of both parts of the object model. To allow users to easily define new
types, we need a simple domain specific language (or little language)
which can be mapped to Hibernate mapping files. See an example in :omero_subs_github_repo_root:`omero-model` at:
- :model_source:`src/main/resources/mappings/acquisition.ome.xml`
From this DSL, various artifacts can be generated: XML Schema, Java
classes, SQL for generating tables, etc. The ultimate goal is to have no
exceptions in the model.
.. figure:: /images/model-generation.png
:align: left
:width: 60%
:alt: Code generation steps figure from https://dx.doi.org/10.1038/nmeth.1896
Conceptually, the XSD files under the :file:`components/specification`
source directory are the starting point for all code generation. Currently
however, the files in :omero_subs_github_repo_root:`omero-model` under :model_sourcedir:`src/main/resources/mappings`
are hand-written based on the XSD files.
The task created from the :dsl_plugin_sourcedir:`src` Java files
is then used to turn the mapping files into generated Java code in :omero_subs_github_repo_root:`omero-model` under the
:file:`build/classes/java/main` directory. These classes are all within the
ome.model package. A few hand-written Java classes can also be found in
:model_sourcedir:`src/main/java/ome/model/internal`.
.. warning::
The following paragraph is **NOT** up-to-date. Using :literal:`build-schema` no longer exists in 5.5.0 and has not been replaced yet.
The build-schema ant target takes the generated ome.model classes as
input and generates the :sourcedir:`sql/psql` scripts which get used by
:program:`omero db script` to generate a working OMERO database. Files named
like :file:`OMEROVERSION__PATCH.sql` are hand-written update scripts.
The primary consumer of the ome.model classes at runtime is the
:omero_subs_github_repo_root:`omero-server`.
The above classes are considered the internal server code, and are the only
objects which can take part in Hibernate transactions.
External to the server code is the :omero_subs_github_repo_root:`omero-blitz` layer. These classes are in the
omero.model package. They are generated by another call to the DSL task
in order to generate the Java, Python, C++, and Ice files under, by default,
:file:`build/psql/`.
In :omero_subs_github_repo_root:`omero-blitz`, the generated Ice files along with the hand-written Ice files from
:blitz_sourcedir:`src/main/slice/omero` are then run through the
``slice2cpp``, ``slice2java``, and ``slice2py`` command-line utilities in
order to generate source code in each of these languages. Clients pass in
instances of these omero.model (or in the case of C++, omero::model) objects.
These are transformed to ome.model objects, and then persisted to the
database.
If we take a concrete example, a C++ client might create an Image via new
``omero::model::ImageI()``. The "I" suffix represents an "implementation" in
the Ice naming scheme and this subclasses from omero::model::Image. This can
be remotely passed to the server which will be deserialized as an
omero.model.ImageI object. This will then get converted to an
ome.model.core.Image, which can finally be persisted to the database.
Keywords
^^^^^^^^
Some words are not allowed as properties/fields of OMERO types. These
include:
- id
- version
- details
- … any SQL keyword
Improving generated data objects
--------------------------------
Constructors
^^^^^^^^^^^^
Two special constructors are generated for each model object. One is for
creating proxy instances, and the other is for filling all NOT-NULL
fields:
::
Pixels p_proxy = new Pixels(Long, boolean);
Pixels p_filled = new Pixels(ome.model.core.Image, ome.model.enums.PixelsType,
java.lang.Integer, java.lang.Integer, java.lang.Integer, java.lang.Integer, java.lang.Integer,
java.lang.String, ome.model.enums.DimensionOrder, ome.model.core.PixelsDimensions);
The first should almost always be used as: ``new Pixels(5L, false)``.
Passing in an argument of ``true`` would imply that this object is
actually loaded, and therefore the server would attempt to null all the
fields on your object. See below for a discussion on loadedness.
In the special case of Enumerations, a constructor is generated
which takes the ``value`` field for the enumeration:
::
Format file_format = new Format("text/plain");
Further, this is the only example of a managed object which will be
loaded by the server **without** its id. This allows applications to
record only the string ``"text/plain"`` and not need to know the actual id
value for ``"text/plain"``.
.. _model details property:
Details
^^^^^^^
Each table in the database has several columns handling low-level
matters such as security, ownership, and provenance. To hide some of
these details in the object model, each IObject instance contains an
ome.model.internal.Details instance.
Details works something like unix's ``stat``:
::
/Types/Images>ls -ltrAG
total 0
-rw------- 1 josh 0 2006-01-25 20:40 Image1
-rw------- 1 josh 0 2006-01-25 20:40 Image2
-rw------- 1 josh 0 2006-01-25 20:40 Image3
-rw-r--r-- 1 josh 0 2006-01-25 20:40 Image100
/Types/Images>stat Image1
File: `Image1'
Size: 0 Blocks: 0 IO Block: 4096 regular empty file
Device: 1602h/5634d Inode: 376221 Links: 1
Access: (0600/-rw-------) Uid: ( 1003/ josh) Gid: ( 1001/ ome)
Access: 2006-01-25 20:40:30.000000000 +0100
Modify: 2006-01-25 20:40:30.000000000 +0100
Change: 2006-01-25 20:40:30.000000000 +0100
though it can also store arbitrary other attributes (meta-metadata, so
to speak) about our model instances. See :ref:`Model#dynamic` below for more
information.
The main methods on Details are:
::
Permissions Details.getPermissions();
List Details.getUpdates();
Event Details.getCreationEvent();
Event Details.getUpdateEvent();
Experimenter Details.getOwner();
ExperimenterGroup Details.getGroup();
ExternalInfo getExternalInfo();
though some of the methods will return :literal:`null`, if that column is not
available for the given object. See :ref:`Model#Interfaces` below for more
information.
Consumers of the API are encouraged to pass around Details instances
rather than specifying particulars, like:
::
if (securitySystem.allowLoad(Project.class, project.getDetails())) {}
// and not
if (project.getDetails().getPermissions().isGranted(USER,READ) && project.getDetails().getOwner().getId( myId )) {…}
This should hopefully save a good deal of re-coding if we move to true
ACL rather than the current filesystem-like access control.
Because it is a field on every type, Details is also on the list of
keywords in the type language (above).
.. _Model#Interfaces:
Interfaces
^^^^^^^^^^
To help work with the generated objects, several interfaces are added to
their "implements" clause:
+------------------------+---------------------------+--------------+-------------+
| Property | Applies\_to | Interface | Notes |
+------------------------+---------------------------+--------------+-------------+
| Base | |
+------------------------+---------------------------+--------------+-------------+
| owner | ! global | | need sudo |
+------------------------+---------------------------+--------------+-------------+
| group | ! global | | need sudo |
+------------------------+---------------------------+--------------+-------------+
| version | ! immutable | | |
+------------------------+---------------------------+--------------+-------------+
| creationEvent | ! global | | |
+------------------------+---------------------------+--------------+-------------+
| updateEvent | ! global && ! immutable | | |
+------------------------+---------------------------+--------------+-------------+
| permissions | | | |
+------------------------+---------------------------+--------------+-------------+
| externalInfo | | | |
+------------------------+---------------------------+--------------+-------------+
| Other | |
+------------------------+---------------------------+--------------+-------------+
| name | | Named | |
+------------------------+---------------------------+--------------+-------------+
| description | | Described | |
+------------------------+---------------------------+--------------+-------------+
| linkedAnnotationList | | IAnnotated | |
+------------------------+---------------------------+--------------+-------------+
For example, ``ome.model.meta.Experimenter`` is a "global" type,
therefore it has no ``Details.owner`` field. In order to create this
type of object, you will either need to have admin privileges, or in
some cases, use the ``ome.api.IAdmin`` interface directly (in the case
of enums, you will need to use the ``ome.api.ITypes`` interface).
.. _Model#Inheritance:
Inheritance
^^^^^^^^^^^
Inheritance is supported in the object model. The superclass
relationships can be defined simply in the mapping files. One example in :omero_subs_github_repo_root:`omero-model` is
the annotation hierarchy in
:model_source:`src/main/resources/mappings/annotations.ome.xml`.
Hibernate supports this polymorphism, and will search all subclasses
when a superclass is returned. *However*, due to Hibernate's use of
bytecode-generated proxies, testing for class equality is not always
straightforward.
Hibernate uses CGLIB and Javassist and similar bytecode generation to
perform much of its magic. For these bytecode generated objects, the
``getClass()`` method returns something of the form
``ome.model.core.Image\_$$\_javassist`` which cannot be passed back into
Hibernate. Instead, we must first parse that class String with
:model_source:`Utils#trueClass() `.
Model report objects
^^^^^^^^^^^^^^^^^^^^
To support the Collection Counts
requirement in which users would like to know how many objects are in a
collection by owner, it was necessary to add read-only
``Map`` fields to all objects with links. See the
:doc:`/developers/Server/CollectionCounts` page for more information.
.. _Model#dynamic:
Dynamic methods
^^^^^^^^^^^^^^^
Finally, because not all programming fits into the static programming
frame, the object model provides several methods for working dynamically
with all IObject subclasses.
fieldSet / putAt / retrieve
"""""""""""""""""""""""""""
Each model class contains a public final static String for each field in
that class (superclass fields are omitted). A copy of all these fields
is available through ``fieldSet()``. This field identifier can be used in
combination with the putAt and retrieve methods to store arbitrary data
in a class instance. Calls to ``putAt / retrieve`` with a string found in
fieldSet delegate to the traditional getters/setters. Otherwise, the
value is stored in lazily-initialized Map (if no data is stored, the
map is :literal:`null`).
acceptFilter
""""""""""""
An automation of calls to ``putAt / retrieve`` can be achieved by
implementing an ome.util.Filter. A Filter is a VisitorPattern-like
interface which not only visits every field of an object, but also has
the chance to replace the field value with an arbitrary other value.
Much of the internal functionality in OMERO is achieved through filters.
Limitations
"""""""""""
- The filter methods override all standard checks such as
IObject#isLoaded and so null-pointer exceptions etc. may be thrown.
- The types stored in the dynamic map currently do not propagate to the
:doc:`/developers/server-blitz` model objects, since not all
java.lang.Objects can be converted.
Entity lifecycle
----------------
These additions make certain operations on the model objects easier and
cleaner, but they do not save the developer from understanding how each
object interacts with Hibernate. Each object has a defined lifecycle and
it is important to know both the origin (client, server, or backend) as
well as its current state to understand what will and can happen with
it.
States
^^^^^^
Each instance can be found in one of several states. Quickly, they are:
**transient**
The entity has been created (``"new Image()"``) and not yet shown to the
backend.
**persistent**
The entity has been stored in the database and has a non-:literal:`null` id
(``IObject.getId()``). Here Hibernate differentiates between detached,
managed, and deleted entities. Detached entities do not take part in
lazy-loading or dirty detection like managed entities do. They can,
however, be re-attached (made "managed"). Deleted entities cannot
take part in most of the ORM activities, and exceptions will be
thrown if they are encountered.
**unloaded** (a reference, or proxy)
To solve the common problem of lazy loading exceptions found in many
Hibernate applications, we have introduced the concept of unloaded
proxy objects which are objects with all fields nulled other than
the id. Attempts to get or set any other property will result in an
exception. The backend detects these proxies and restores their
value before operating on the graph. There are two related states
for collections -- :literal:`null` which is completely unloaded, and
filtered in which certain items have been removed (more on this
below).
.. figure:: /images/ObjectStates.png
:align: center
:alt: Object states
Identity, references, and versions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Critical for understanding these states is understanding the concepts of
identity and versioning as it relates to ORM. Every object has an id
field that if created by the backend will not be :literal:`null`. However,
every table does not have a primary key field -- subclasses contain a foreign
key link to their superclass. Therefore all objects without an id are
assumed to be non-persistent (i.e. transient).
Though the id cannot be the sole decider of equality since there are issues
with the Java definition of equals() and hashCode(), we often perform lookups
based on the class and id of an instance. Here again caution must be
taken not to unintentionally use a possibly bytecode-generated subclass. See
the discussion under :ref:`Model#Inheritance` above.
Class/id-based lookup is in fact so useful that it is possible to take
an model object and call ``obj.unload()`` to have a "reference" --
essentially a placeholder for a model object that contains only an id.
Calls to any accessors other than get/setId will throw an exception. An
object can be tested for loadedness with ``obj.isLoaded()``.
A client can use unloaded instances to inform the backend that a certain
information is not available and should be filled in server-side. For
example, a user can do the following:
::
Project p = new Project();
Dataset d = new Dataset( new Long(1), false); // this means create an already unloaded instance
p.linkDataset(d);
iUpdate.saveObject(p);
The server, in turn, also uses references to replace backend proxies
that would otherwise throw ``LazyInitializationException``\ s on
serialization. Clients, therefore, must code with the expectation that
the leaves in an object graph may be unloaded. Extending a query with
"outer join fetch" will cause these objects to be loaded as well. For
example:
::
select p from Project p
left outer join fetch p.datasetLinks as links
left outer join fetch links.child as dataset
but eventually in the complex OME metadata graph, it is certain that
something will remain unloaded.
Versions are the last piece to understanding object identity. Two
entities with the same id should not be considered equal if they have
differing versions. On each write operation, the version of an entity
is incremented. This allows us to perform optimistic locking so that two
users do not simultaneously edit the same object. That works so:
#. User A and User B retrieve Object X id=1, version=0.
#. User A edits Object X and saves it. Version is incremented to 1.
#. User B edits Object X and tries to save it. The SQL generated is:
UPDATE table SET value = newvalue WHERE id = 1 and version = 0; which
upates no rows.
#. The fact that no rows were altered is seen by the backend and an
:literal:`OptimisticLockException` is thrown.
Identity and versioning make working with the object model difficult
sometimes, but guarantee that our data is never corrupted.
Working with the object model
-----------------------------
With these states in mind, it is possible to start looking at how to
actually use model objects. From the point of view of the server,
everything is either an assertion of an object graph (a "write") or a
request for an object graph (a "read"), whether they are coming from an
RMI client, an :doc:`server-blitz` client, or even being generated internally.
Writing
^^^^^^^
Creating new objects is as simple as instantiating objects and linking
them together. If all NOT-NULL fields are not filled, then a
``ValidationException`` will be thrown by the server:
::
IUpdate update = new ServiceFactory().getUpdateService();
Image i = new Image();
try {
update.saveObject(i);
catch (ValidationException ve) {
// not ok.
}
i.setName("image");
return update.saveAndReturnObject(i); // ok.
Otherwise, the returned value will be the Image with its id field filled. This
works on arbitrarily complex graphs of objects:
::
Image i = new Image("image-name"); // This constructor exists because "name" is the only required field.
Dataset d = new Dataset("dataset-name");
TagAnnotation tag = new TagAnnotation();
tag.setTextValue("some-tag");
i.linkDataset(d);
i.linkAnnotation(tag);
update.saveAndReturnObject(i);
Reading
^^^^^^^
Reading is a similarly straightforward operation. From a simple id-based
lookup, ``iQuery.get(Experimenter.class, 1L)`` to a search for an
arbitrarily complex graph:
::
Image i = iQuery.findByQuery("select i from Image i "+
"join fetch i.datasetLinks as dlinks "+
"join fetch i.annotationLinks as alinks "+
"join fetch i.details.owner as owner "+
"join fetch owner.details.creationEvent "+
"where i.id = :id", new Parameters().addId(1L));
In the return graph, you are guaranteed that any two instances of the
same class with the same id are the same object. For example:
::
Image i = …; // From query
Dataset d = i.linkedDatasetList().get(0);
Image i2 = d.linkedImageList().get(0);
if (i.getId().equals(i2.getId()) {
assert i == i2 : "Instances must be referentially equal";
}
Reading and writing
^^^^^^^^^^^^^^^^^^^
Complications arise when you try to mix objects from different read
operations because of the difference in equality. In all but the most
straightforward applications, references to :literal:`IObject` instances from
different return graphs will start to intermingle. For example, when a
user logins in, you might query for all Projects belonging to the user:
::
List projects = iQuery.findAllByQuery("select p from Project p where p.details.owner.omeName = someUser", null);
Project p = projects.get(0);
Long id = p.getId();
Later you might query for Datasets, and be returned some of the same
Projects again within the same graph. You have now possibly got two
versions of the Project with a given id within your application. And if
one of those Projects has a new Dataset reference, then Hibernate would
not know whether the object should be removed or not.
::
Project oldProject = …; // Acquired from first query
// Do some other work
Dataset dataset = iQuery.findByQuery("select d from Dataset d "+
"join fetch d.projectsLinks links "+
"join fetch links.parent "+
"where d.id = :id", new Parameters().addId(5L));
Project newProject = dataset.linkedProjectList().get(0);
assert newProject.getId().equals(oldProject.getId()) : "same object";
assert newProject.sizeOfDatasetLinks() == oldProject.sizeOfDatasetLinks() :
"if this is false, then saving oldProject is a problem";
Without optimistic locks, trying to save oldProject
would cause whatever Datasets were missing from it to be removed from
newProject as well. Instead, an ``OptimisticLockException`` is thrown
if a user tries to change an older reference to an entity. Similar
problems also arise in multi-user settings, when two users try to access
the same object, but it is not purely due to multiple users or even
multiple threads, but simply due to stale state.
.. note::
There is an issue with multiple users in which a
``SecurityViolation`` is thrown instead of an ``OptimisticLockException``.
Various techniques to help to manage these duplications are:
- Copy all data to your own model.
- Return unloaded objects wherever possible.
- Be very careful about the operations you commit and about the order
they take place in.
- Use a :literal:`ClientSession`.
Lazy loading
^^^^^^^^^^^^
An issue related to identity is lazy loading. When an object graph is
requested, Hibernate loads only the objects which are directly
requested. All others are replaced with proxy objects. Within the
Hibernate session, these objects are "active" and if accessed, they
will be automatically loaded. This is taken care of by the first-level
cache, and is also the reason that referential equality is guaranteed
within the Hibernate session. Outside of the session however, the
proxies can no longer be loaded and so they cannot be serialized to the
client.
Instead, as the return value passes through OMERO's AOP layer, they get
disconnected. Single-valued fields are replaced by an unloaded version:
::
OriginalFile ofile = …; // Object to test
if ( ! Hibernate.isInitialized( ofile.getFormat() ) {
ofile.setFormat( new Format( ofile.getFormat().getId(), false) );
}
Multi-valued fields, or collections, are simply nulled. In this case,
the :literal:`sizeOfXXX` method will return a value less than zero:
::
Dataset d = …; // Dataset obtained from a query. Didn't request Projects
assert d.sizeOfProjects() < 0 : "Projects should not be loaded";
This is why it is necessary to specify all "join fetch" clauses for
instances which are required on the client-side. See
:server_source:`ProxyCleanupFilter `
for the implementation.
Collections
^^^^^^^^^^^
More than just the nulling during serialization, collections pose
several interesting problems.
For example, a collection may be filtered on retrieval:
::
Dataset d = iQuery.findByQuery("select d from Dataset d "+
"join fetch d.projectLinks links "+
"where links.parent.id > 2000", null);
Some ``ProjectDatasetLink`` instances have been filtered from the
projectLinks collection. If the client decides to save this collection
back, there is no way to know that it is incomplete, and Hibernate will
remove the missing Projects from the Dataset. It is the developer's
responsibility to know what state a collection is in. In the case of
links, discussed below, one solution is to use the link objects
directly, even if they are largely hidden with the API, but the problem
remains for 1-N collections.
.. _Model#Links:
Links
^^^^^
A special form of links collection model the many-to-many
relationship between two other objects. A Project can contain any number
of Datasets, and a Dataset can be in any number of Projects. This is
achieved by ``ProjectDatasetLinks``, which have a Project "parent" and a
Dataset "child" (the parent/child terms are somewhat arbitrary but are
intended to fit roughly with the users' expectations for those types).
It is possible to both add and remove a link directly:
::
ProjectDatasetLink link = new ProjectDatasetLink();
link.setParent( someProject );
link.setChild( someDataset );
link = update.saveAndReturnObject( link );
// someDataset is now included in someProject
update.deleteObject(link);
// or update.deleteObject(new ProjectDatasetLink(link.getId(), false)); // a proxy
// Now the Dataset is not included,
// __unless__ there was already another link.
However, it is also possible to have the links managed for you:
::
someProject.linkDataset( someDataset ); // This creates the link
update.saveObject( someProject ); // Notices added link, and saves it
someProject.unlinkDataset( someDataset );
update.saveObject( someProject ); // Notices removal, and deletes it
The difficulty with this approach is that ``unlinkDataset()`` will fail
if the someDataset which you are trying to remove is not referentially
equal. That is:
::
someProject.linkDataset( someDataset );
updatedProject = update.saveAndReturnObject( someProject );
updatedProject.unlinkDataset( someDataset );
update.saveObject( updateProject ); // will do __nothing__ !
does not work since someDataset is not included in updatedProject, but
rather updatedDataset with the same id is. Therefore, it would be
necessary to do something along the following lines of:
::
updatedProject = …; // As before
for (Dataset updatedDataset : updatedProject.linkedDatasetList() ) {
if (updatedDataset.getId().equals( someDataset.getId() )) {
updatedProject.unlinkDataset( updatedDataset );
}
}
The unlink method in this case, removes the link from both the
Project.datasetLinks collection as well as from the Dataset.projectLinks
collection. Hibernate notices that both collections are in agreement,
and deletes the ProjectDatasetLink (this is achieved via the
"delete-orphan" annotation in Hibernate). If only one side of the
collection has had its link removed, an exception will be thrown.
Synchronization
^^^^^^^^^^^^^^^
Another important point is that the model objects are in no way
synchronized. All synchronization must occur within application code.
Limitations
-----------
We try to minimize differences between the Model as described by the XML
specification and its implementation in the OMERO database but some Objects
may behave in a more restricted fashion within OMERO. Examples include:
- ROIs and rendering settings can only belong to one Image