org.apache.solr.hadoop.dedup
Interface UpdateConflictResolver
- All Known Implementing Classes:
- NoChangeUpdateConflictResolver, RejectingUpdateConflictResolver, RetainMostRecentUpdateConflictResolver, SortingUpdateConflictResolver
public interface UpdateConflictResolver
Interface that enables deduplication and ordering of a series of document
updates for the same unique document key.
For example, a MapReduce batch job might index multiple files in the same job
where some of the files contain old and new versions of the very same
document, using the same unique document key.
Typically, implementations of this interface forbid collisions by throwing an
exception, or ignore all but the most recent document version, or, in the
general case, order colliding updates ascending from least recent to most
recent (partial) update.
The caller of this interface (i.e. the Hadoop Reducer) will then apply the
updates to Solr in the order returned by the orderUpdates() method.
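As a rough illustration of the general case (ordering colliding updates from least recent to most recent), the sketch below uses a stand-in `Update` class instead of `SolrInputDocument`, so it runs without Solr or Hadoop on the classpath. The `Update` type, its `timestamp` field, and the `orderAscending` helper are assumptions made for this example and are not part of the documented API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;

class OrderingSketch {

    // Stand-in for SolrInputDocument: a version label plus a timestamp.
    // Both fields are assumptions made for this illustration.
    static final class Update {
        final String label;
        final long timestamp;

        Update(String label, long timestamp) {
            this.label = label;
            this.timestamp = timestamp;
        }
    }

    // Drain the colliding updates and return them least recent first, so that
    // applying them in the returned order leaves the newest version in place.
    static Iterator<Update> orderAscending(Iterator<Update> collidingUpdates) {
        List<Update> sorted = new ArrayList<>();
        while (collidingUpdates.hasNext()) {
            sorted.add(collidingUpdates.next());
        }
        // List.sort is stable: updates with equal timestamps keep their input order.
        sorted.sort(Comparator.comparingLong((Update u) -> u.timestamp));
        return sorted.iterator();
    }

    public static void main(String[] args) {
        Iterator<Update> ordered = orderAscending(Arrays.asList(
                new Update("v2", 200L),
                new Update("v1", 100L),
                new Update("v3", 300L)).iterator());
        while (ordered.hasNext()) {
            System.out.println(ordered.next().label);
        }
    }
}
```

A real implementation would read the ordering criterion from a field of each `SolrInputDocument`; which field that is depends on the application.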
Configuration: If an UpdateConflictResolver implementation also implements
Configurable then the Hadoop Reducer will call
Configurable.setConf(org.apache.hadoop.conf.Configuration) on
instance construction and pass the standard Hadoop configuration information.
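The configuration handshake described above can be pictured roughly as follows. This sketch deliberately uses local stand-ins rather than the real Hadoop types: `ConfigurableLike` plays the role of `Configurable`, a `Map<String, String>` plays the role of `Configuration`, and the property name `example.dedup.timestampField` is invented for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

class ConfigurableSketch {

    // Stand-in for org.apache.hadoop.conf.Configurable (assumption: a Map replaces Configuration).
    interface ConfigurableLike {
        void setConf(Map<String, String> conf);
    }

    // Hypothetical resolver that learns from the configuration which field holds the timestamp.
    static final class TimestampFieldResolver implements ConfigurableLike {
        // Hypothetical property name and default value, invented for this sketch.
        static final String FIELD_PROPERTY = "example.dedup.timestampField";
        String timestampField = "timestamp";

        @Override
        public void setConf(Map<String, String> conf) {
            timestampField = conf.getOrDefault(FIELD_PROPERTY, timestampField);
        }
    }

    // What the caller does on construction: configure the instance only if it opts in.
    static <T> T construct(T resolver, Map<String, String> conf) {
        if (resolver instanceof ConfigurableLike) {
            ((ConfigurableLike) resolver).setConf(conf);
        }
        return resolver;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TimestampFieldResolver.FIELD_PROPERTY, "last_modified");
        TimestampFieldResolver resolver = construct(new TimestampFieldResolver(), conf);
        System.out.println(resolver.timestampField);
    }
}
```

An implementation that does not need any settings simply does not implement the configuration interface and is constructed without the extra call.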
Method Summary
Iterator<SolrInputDocument>
orderUpdates(org.apache.hadoop.io.Text uniqueKey,
Iterator<SolrInputDocument> collidingUpdates,
org.apache.hadoop.mapreduce.Reducer.Context context)
Given a list of all colliding document updates for the same unique document
key, this method returns zero or more documents in an application-specific
order.
orderUpdates
Iterator<SolrInputDocument> orderUpdates(org.apache.hadoop.io.Text uniqueKey,
Iterator<SolrInputDocument> collidingUpdates,
org.apache.hadoop.mapreduce.Reducer.Context context)
- Given a list of all colliding document updates for the same unique document
key, this method returns zero or more documents in an application-specific
order.
The caller will then apply the updates for this key to Solr in the order
returned by the orderUpdates() method.
- Parameters:
uniqueKey - the document key common to all collidingUpdates mentioned below
collidingUpdates - all updates in the MapReduce job that have a key equal to
uniqueKey mentioned above. The input order is unspecified.
context - the Context passed from the Reducer implementations
- Returns:
- the order in which the updates shall be applied to Solr
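To make the "zero or more documents" contract concrete, here is a minimal sketch of two simple strategies: keeping only the most recent update, and forbidding collisions by throwing an exception. It does not reproduce the shipped implementations. `Update` is a stand-in for `SolrInputDocument`, its `timestamp` field is an assumption, and the helper methods take a plain `String` key instead of a Hadoop `Text`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

class ResolverStrategiesSketch {

    // Stand-in for SolrInputDocument; the timestamp field is an assumption.
    static final class Update {
        final String label;
        final long timestamp;

        Update(String label, long timestamp) {
            this.label = label;
            this.timestamp = timestamp;
        }
    }

    // Keep only the update with the highest timestamp.
    // The input order is unspecified, so every element has to be inspected.
    static Iterator<Update> retainMostRecent(Iterator<Update> collidingUpdates) {
        Update mostRecent = null;
        while (collidingUpdates.hasNext()) {
            Update next = collidingUpdates.next();
            if (mostRecent == null || next.timestamp > mostRecent.timestamp) {
                mostRecent = next;
            }
        }
        // Empty input yields an empty result ("zero or more documents").
        return mostRecent == null
                ? Collections.<Update>emptyIterator()
                : Collections.singletonList(mostRecent).iterator();
    }

    // Forbid collisions: pass a single update through, fail as soon as a second one shows up.
    static Iterator<Update> rejectCollisions(String uniqueKey, Iterator<Update> collidingUpdates) {
        if (!collidingUpdates.hasNext()) {
            return Collections.<Update>emptyIterator();
        }
        Update first = collidingUpdates.next();
        if (collidingUpdates.hasNext()) {
            throw new IllegalArgumentException("Update conflict for key: " + uniqueKey);
        }
        return Collections.singletonList(first).iterator();
    }

    public static void main(String[] args) {
        Iterator<Update> kept = retainMostRecent(Arrays.asList(
                new Update("old", 100L),
                new Update("new", 200L)).iterator());
        System.out.println(kept.next().label);
    }
}
```

Both helpers return an `Iterator`, mirroring the return type of `orderUpdates`, so the caller can apply whatever comes back without knowing which strategy produced it.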
Copyright © 2000-2014 Apache Software Foundation. All Rights Reserved.