Version: 3.5.0

org.generationcp.core.datasource
Interface DataSource

All Superinterfaces:
DataComponent, Identification
All Known Subinterfaces:
WritableDataSource
All Known Implementing Classes:
AbstractDataSource

public interface DataSource
extends DataComponent

A data source represents any source of GCP data: it can be a database, a collection of Biomoby services, a flat file, or anything else. Each such data source is represented in the system by a class that implements this DataSource interface.

Each DataSource is a simple abstraction that gives access both to metadata about a data source and to the data it holds:

Metadata characterise the data source. It includes:

Data are the core of any data source:

Data sources deal with data types and data type attributes (usually, of the objects defined in the GCP Domain model). Providers and users of data sources identify these types by unique identifiers (that come from GCP ontology). These unique identifiers are then used in many DataSource methods, for example:

Another important concept worth mentioning is that the attributes of what a data source returns do not need to be the same as the attributes used in the search criteria that select what is returned. For example, a data source can return crop location information (a set of latitudes and longitudes) based on the search criterion country_name EQUAL China.
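The separation between searched attributes and returned attributes can be illustrated with a minimal in-memory sketch. It does not use the DataSource API at all: the class ProjectionSketch, the method select, the record layout and the coordinate values are all assumptions made up for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the GCP library.
final class ProjectionSketch {

    // Keeps the records whose filterAttribute equals filterValue, then copies
    // only the requested attributes into each returned record.
    static List<Map<String, Object>> select(List<Map<String, Object>> records,
                                            String filterAttribute,
                                            Object filterValue,
                                            String... returnedAttributes) {
        List<Map<String, Object>> result = new ArrayList<>();
        for (Map<String, Object> record : records) {
            if (!filterValue.equals(record.get(filterAttribute))) {
                continue; // does not comply with the search criterion
            }
            Map<String, Object> projected = new LinkedHashMap<>();
            for (String attribute : returnedAttributes) {
                projected.put(attribute, record.get(attribute));
            }
            result.add(projected);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        Map<String, Object> first = Map.of(
            "country_name", "China", "latitude", 30.6, "longitude", 104.1);
        Map<String, Object> second = Map.of(
            "country_name", "Mexico", "latitude", 19.5, "longitude", -98.9);

        // Search on country_name, but return only latitude and longitude.
        System.out.println(
            select(List.of(first, second), "country_name", "China", "latitude", "longitude"));
        // prints [{latitude=30.6, longitude=104.1}]
    }
}
```

The search attribute (country_name) never appears in the result, which mirrors the example in the paragraph above.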

An obvious question is what a data source should represent. Should a data source represent a whole database, or should there be several data sources, each representing a main part of a database? There is no obvious answer. This interface is sufficient to cover both cases (because its methods can tell how many and which data types can be provided).

Also, one physical data resource (such as a database) can be represented by two (or more) different data sources, each of them accessing the same database using a different protocol. An example would be two data sources, one of them accessing a local database directly, using JDBC, and the other accessing the same database using Web Services. Even though these two data sources access the same database, they may provide a different range of searchable attributes (perhaps because the one based on Web Services cannot understand more sophisticated search criteria).

Version:
$Id: DataSource.html 19678 2010-07-23 23:41:44Z jbmorales $
Author:
Martin Senger
See Also:
A tutorial: How to create and use Data Sources

Method Summary
 java.util.List<java.lang.Object> find(java.lang.String dataTypeIdentifier, SearchFilter[] filters, java.lang.String[] includedAttributesIdentifiers, java.util.Map<java.lang.String,java.lang.Object> options)
          Retrieve data from this data source.
 DataType getDataType(java.lang.String dataTypeIdentifier)
          Check if this data source can provide a given data type.
 DataType[] getDataTypes()
          Return all data types that this data source can provide (that is, can return from the find() method).
 java.util.Map getSupportedOptions(java.lang.String dataTypeIdentifier)
          Options are additional features of this data source.
 
Methods inherited from interface org.generationcp.core.DataComponent
getMetadata
 
Methods inherited from interface org.generationcp.core.Identification
getClassification, getDescription, getName, getUniqueIdentifier
 

Method Detail

getDataTypes

DataType[] getDataTypes()
Return all data types that this data source can provide (that is, can return from the find() method). The returned types may also represent data attributes. For example, a data source may return full Germplasm objects, or just a list of strings representing Germplasm IDs.

This method is useful, for example, when you are building a GUI and you wish to display types of all possible search results - so the users can select which result type they are interested in.

Returns:
an empty array (not null) if this data source does not provide anything (which should not be the usual case, of course). Otherwise, it returns the data types or data attribute types available from this data source.

See Also:
getDataType()
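The GUI use case mentioned above might look roughly like the following sketch. The nested DataType and DataSource interfaces are hypothetical stand-ins that model only the methods named on this page, and the identifiers "example:Germplasm" and "example:GermplasmId" are invented; real identifiers come from the GCP ontology.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; the nested interfaces are stand-ins.
final class DataTypeMenuSketch {

    // Stand-in for the library's DataType; only getUniqueIdentifier() is modelled.
    interface DataType {
        String getUniqueIdentifier();
    }

    // Stand-in for DataSource; only getDataTypes() is modelled.
    interface DataSource {
        DataType[] getDataTypes();
    }

    // Collects the identifiers of all possible result types,
    // for example to fill a selection list in a GUI.
    static List<String> resultTypeChoices(DataSource source) {
        List<String> choices = new ArrayList<>();
        // No null check on the array: the contract promises a non-null result.
        for (DataType type : source.getDataTypes()) {
            choices.add(type.getUniqueIdentifier());
        }
        return choices;
    }

    public static void main(String[] args) {
        DataSource source = () -> new DataType[] {
            () -> "example:Germplasm",
            () -> "example:GermplasmId"
        };
        System.out.println(resultTypeChoices(source));
        // prints [example:Germplasm, example:GermplasmId]
    }
}
```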

getDataType

DataType getDataType(java.lang.String dataTypeIdentifier)
Check if this data source can provide a given data type.

Parameters:
dataTypeIdentifier - corresponds to DataType.getUniqueIdentifier(); it identifies the data type the caller is asking about

Returns:
a full DataType whose identifier is equal to 'dataTypeIdentifier', or null if such data type cannot be provided by this data source

See Also:
getDataTypes()
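Because getDataType() returns null for an unsupported type, it can serve as a capability check when several data sources are available. The sketch below picks the first data source that can provide a given type. The nested interfaces are hypothetical stand-ins, and the helpers firstProviding and providing, as well as the identifiers, are invented for this illustration.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; the nested interfaces are stand-ins.
final class ProviderLookupSketch {

    interface DataType {
        String getUniqueIdentifier();
    }

    // Stand-in for DataSource; only getDataType() is modelled.
    interface DataSource {
        DataType getDataType(String dataTypeIdentifier);
    }

    // Returns the first data source that can provide the given data type.
    static Optional<DataSource> firstProviding(List<DataSource> sources, String dataTypeIdentifier) {
        for (DataSource source : sources) {
            if (source.getDataType(dataTypeIdentifier) != null) { // null means "not provided"
                return Optional.of(source);
            }
        }
        return Optional.empty();
    }

    // Test helper: a data source that provides exactly one data type.
    static DataSource providing(String supportedIdentifier) {
        return identifier -> {
            if (!supportedIdentifier.equals(identifier)) {
                return null;
            }
            DataType type = () -> supportedIdentifier;
            return type;
        };
    }

    public static void main(String[] args) {
        List<DataSource> sources = List.of(providing("example:Study"), providing("example:Germplasm"));
        System.out.println(firstProviding(sources, "example:Germplasm").isPresent()); // prints true
        System.out.println(firstProviding(sources, "example:Marker").isPresent());    // prints false
    }
}
```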

getSupportedOptions

java.util.Map getSupportedOptions(java.lang.String dataTypeIdentifier)
Options are additional features of this data source. This method shows what options are available (supported) for a particular returned type.

Each data source can provide more than one returned type, and each of them can support a different set of options. Therefore, this method has a parameter indicating which return type the supported options belong to. For the names of possible options see the Option ontology; the widely recognized options are described here:

Name/Key: SORTED_BY
Type of value: String[]
Default value: no sorting
Description: The value is an array of unique identifiers of the data type attributes that can be used as sorting keys. The more elements in the array, the more sorting keys. All other options related to sorting are used only if this option has a non-empty value.

Name/Key: SORT_ORDER
Type of value: Boolean[] or Boolean
Default value: sort ascending
Description: If used in conjunction with SORTED_BY, the value is an array of Booleans that indicate the sort order of the data type attributes used as sorting keys (as defined in the SORTED_BY option): true indicates ascending order and false indicates descending order. If it is not used in conjunction with SORTED_BY, the value is of type Boolean and determines the sort order of the returned objects themselves. Again, true indicates ascending order and false indicates descending order.

Name/Key: DISTINCT
Type of value: Boolean
Default value: false
Description: If true, then the returned list is effectively a set, meaning that it contains no duplicate objects. Of course, the objects themselves define what they consider to be equality.
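How a SORTED_BY array pairs up with a SORT_ORDER array can be sketched with a plain comparator. This is only an interpretation of the option descriptions above, not code from any real data source: records are modelled as Map<String, String>, every record is assumed to contain every sort key, and a missing SORT_ORDER entry is assumed to mean ascending.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of the GCP library.
final class SortOptionsSketch {

    // Builds one comparator from parallel arrays: sortedBy[i] names the i-th
    // sorting key and sortOrder[i] says whether that key sorts ascending.
    static Comparator<Map<String, String>> comparatorFor(String[] sortedBy, Boolean[] sortOrder) {
        Comparator<Map<String, String>> result = (a, b) -> 0; // no keys: keep the given order
        for (int i = 0; i < sortedBy.length; i++) {
            final String key = sortedBy[i];
            Comparator<Map<String, String>> byKey = Comparator.comparing(record -> record.get(key));
            // Assumption: an absent order flag falls back to ascending.
            boolean ascending = sortOrder == null || i >= sortOrder.length || sortOrder[i];
            result = result.thenComparing(ascending ? byKey : byKey.reversed());
        }
        return result;
    }

    public static void main(String[] args) {
        List<Map<String, String>> records = new ArrayList<>();
        records.add(Map.of("country", "China", "year", "2009"));
        records.add(Map.of("country", "Brazil", "year", "2010"));
        records.add(Map.of("country", "China", "year", "2010"));

        // country ascending, then year descending
        records.sort(comparatorFor(new String[] {"country", "year"},
                                   new Boolean[] {true, false}));
        for (Map<String, String> record : records) {
            System.out.println(record.get("country") + " " + record.get("year"));
        }
        // prints Brazil 2010, then China 2010, then China 2009
    }
}
```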

Parameters:
dataTypeIdentifier - identifies a return type (returned by the find() method) for which this method returns supported options. It is the same string as used in the first parameter of the find() method.

Returns:
a Map of the supported options for the given type. If there are no supported options, it returns an empty Map (not null).
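A caller might use this method to drop unsupported options before calling find(). The sketch below assumes that the keys of the returned Map are the option names; this page does not say what the Map values hold, so they are ignored. The nested DataSource interface is a hypothetical stand-in and keepSupported is an invented helper.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the nested interface is a stand-in.
final class SupportedOptionsSketch {

    // Stand-in for DataSource; only getSupportedOptions() is modelled.
    interface DataSource {
        Map<?, ?> getSupportedOptions(String dataTypeIdentifier);
    }

    // Keeps only those requested options whose name is a key of the supported Map.
    // Assumption: the supported Map is keyed by option name.
    static Map<String, Object> keepSupported(Map<?, ?> supported, Map<String, Object> requested) {
        Map<String, Object> result = new HashMap<>();
        for (Map.Entry<String, Object> option : requested.entrySet()) {
            if (supported.containsKey(option.getKey())) {
                result.put(option.getKey(), option.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // The value stored under the key is a placeholder; its meaning is not specified here.
        DataSource source = identifier -> Map.of("DISTINCT", Boolean.FALSE);

        Map<String, Object> requested = new HashMap<>();
        requested.put("DISTINCT", Boolean.TRUE);
        requested.put("SORTED_BY", new String[] {"example:name"});

        Map<String, Object> usable =
            keepSupported(source.getSupportedOptions("example:Germplasm"), requested);
        System.out.println(usable.keySet()); // prints [DISTINCT]
    }
}
```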


find

java.util.List<java.lang.Object> find(java.lang.String dataTypeIdentifier,
                                      SearchFilter[] filters,
                                      java.lang.String[] includedAttributesIdentifiers,
                                      java.util.Map<java.lang.String,java.lang.Object> options)
                                      throws java.lang.IllegalArgumentException,
                                             GCPException
Retrieve data from this data source.

This is the most important method of this interface - it returns the actual data held by this data source.

Parameters:
dataTypeIdentifier - identifies the data type whose instances should be returned. Of course, it makes sense to specify here only one of the data types returned by the method getDataTypes(), or a data type returned by getDataType(). Note that the content of the returned data can be further influenced by the 'includedAttributesIdentifiers' parameter.

filters - are the search criteria. The returned set contains only data records complying with these search criteria. If there is more than one filter, the filters are combined by the logical AND operator. If there are no search criteria (meaning that all records for a particular data type should be returned), pass an empty array (not null). See SearchFilter for details on how to build individual filters.

includedAttributesIdentifiers - defines how fully populated the returned data are. It is a list of attribute names that the caller wants to be present in the returned data objects. If the parameter is an empty array, or null, the specific data source implementation must determine how fully populated the returned objects should be. Each object should contain at least a name and a unique identifier.

options - add some additional features to the returned data set. You should use only options that are supported for a particular returned type - you can find them with the getSupportedOptions() method. The option names are defined in the Option ontology. If not required, this parameter may be set to an empty java.util.Map or null.
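The AND semantics of the 'filters' parameter, including the "empty array matches everything" rule, can be sketched as follows. The real SearchFilter API is not shown on this page, so the nested Filter interface is a hypothetical stand-in that simply accepts or rejects a record, and the record layout is invented.

```java
import java.util.Map;

// Hypothetical illustration only; Filter is NOT the library's SearchFilter.
final class AndFiltersSketch {

    // Stand-in for a search criterion over one record.
    interface Filter {
        boolean accepts(Map<String, Object> record);
    }

    // A record complies only if every filter accepts it (logical AND).
    // With an empty array the loop body never runs, so every record complies.
    static boolean matchesAll(Map<String, Object> record, Filter[] filters) {
        for (Filter filter : filters) {
            if (!filter.accepts(record)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, Object> record = Map.of("country_name", "China", "year", 2009);

        Filter inChina = r -> "China".equals(r.get("country_name"));
        Filter after2009 = r -> ((Integer) r.get("year")) > 2009;

        System.out.println(matchesAll(record, new Filter[] {inChina}));            // prints true
        System.out.println(matchesAll(record, new Filter[] {inChina, after2009})); // prints false
        System.out.println(matchesAll(record, new Filter[0]));                     // prints true
    }
}
```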

Returns:
a list of data instances from this data source, complying with the search criteria specified in the 'filters' parameter. The list has elements of the type specified in the 'dataTypeIdentifier' parameter. Returns an empty list (not null) if there are no complying data.

Important: This specification allows a returned list in which some elements are null. The caller should simply ignore these elements. Their existence indicates that more data comply with the search filters but, for some reason, cannot be returned.
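A caller that follows the rules above (empty filter array for "all records", null for unused parameters, null elements ignored) might look like this sketch. The nested SearchFilter and DataSource types are hypothetical stand-ins; in particular, the stand-in find() omits the throws clause of the real method to keep the example short. The helper findAllUsable and the sample values are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; the nested types are stand-ins.
final class FindResultSketch {

    // Marker stand-in; the real SearchFilter is not described on this page.
    interface SearchFilter {
    }

    // Stand-in for DataSource.find(); the real method also declares GCPException.
    interface DataSource {
        List<Object> find(String dataTypeIdentifier,
                          SearchFilter[] filters,
                          String[] includedAttributesIdentifiers,
                          Map<String, Object> options);
    }

    // Asks for all records of a type and drops the null placeholders.
    static List<Object> findAllUsable(DataSource source, String dataTypeIdentifier) {
        List<Object> raw = source.find(
            dataTypeIdentifier,
            new SearchFilter[0], // no search criteria: empty array, not null
            null,                // let the data source decide how much to populate
            null);               // no options
        List<Object> usable = new ArrayList<>();
        for (Object element : raw) {
            if (element != null) { // null marks a record that could not be returned
                usable.add(element);
            }
        }
        return usable;
    }

    public static void main(String[] args) {
        DataSource source = (id, filters, attributes, options) -> Arrays.asList("G-1", null, "G-2");
        System.out.println(findAllUsable(source, "example:GermplasmId")); // prints [G-1, G-2]
    }
}
```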

Throws:
java.lang.IllegalArgumentException - if the given 'dataTypeIdentifier' identifies a data type that is not provided by this data source, or if the search filters contain unsupported attributes, unsupported operators, or an invalid value.

Regarding the invalid value, this exception can be raised only if the "validity" of a value is obvious and can be deduced from the operator used for this value, or from the type of the searchable attribute. For example, a non-numeric value used in conjunction with the "greater-than" operator and a numeric attribute would be considered an invalid value. However, in cases where it is not obvious which attribute is going to be searched - which is the case for "all reasonable" attributes (see the ontology for an explanation) - an invalid value (a value that does not fit the attribute type) is simply ignored without raising this exception.

GCPException - if data cannot be obtained (e.g. because of network or database access problems). Please make the error message as detailed as possible, but be aware that this message is meant to be shown to end users. Internal details, therefore, belong in a log file instead.
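On the caller's side, the two exception types can be handled separately, since one signals a faulty request and the other an unavailable resource. The sketch below assumes that GCPException is a checked exception with a message constructor; the nested GCPException class and the Query interface (standing in for a prepared find() call) are hypothetical, and the message texts are invented.

```java
import java.util.List;

// Hypothetical illustration only; the nested types are stand-ins.
final class FindErrorHandlingSketch {

    // Stand-in for the library's GCPException (assumed to be a checked exception).
    static class GCPException extends Exception {
        GCPException(String message) {
            super(message);
        }
    }

    // Stands for a prepared call to DataSource.find().
    interface Query {
        List<Object> run() throws GCPException;
    }

    // Turns the outcome of a query into a message suitable for end users.
    static String describeOutcome(Query query) {
        try {
            return "found " + query.run().size() + " record(s)";
        } catch (IllegalArgumentException e) {
            // The request itself was wrong: unknown data type or unsupported filter.
            return "invalid request: " + e.getMessage();
        } catch (GCPException e) {
            // The data could not be obtained; the message is meant for end users.
            return "data unavailable: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeOutcome(() -> List.of("G-1", "G-2")));
        System.out.println(describeOutcome(() -> {
            throw new GCPException("the database is not reachable");
        }));
    }
}
```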

Generated: Fri Jul 23 18:24:48 CDT 2010