E - the type of objects handled.

public class XValidatingObjectCorpus<E> extends Corpus<ObjectHandler<E>> implements ObjectHandler<E>, Serializable

XValidatingObjectCorpus holds a list of items
which it uses to provide training and testing items using
cross-validation.
handle(Object) is used to add items to the
corpus. The items will be stored in the order in which they
are received (though they may be permuted later).
When used as a handler, this class simply collects the items and stores them in a list. This allows an instance of this class to be used like any other object handler.
Initially, the fold will be set to 0, but the fold may be reset
later using setFold(int). Iterating between 0 and the
number of folds minus 1 will work through all folds. The method
size() returns the size of the corpus and fold()
is the current fold.
For cases where numFolds() is greater than zero,
the start and end of a fold are defined by:

start(fold) = (int) (size() * fold / (double) numFolds())
end(fold) = start(fold+1)

If numFolds() is 0, the start and end for the fold
are 0, so that visiting the training part of the corpus visits
the entire corpus.
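
The boundary arithmetic above can be sketched in plain Java. The class name FoldBounds and its static methods are hypothetical stand-ins written for illustration only; they mirror the documented formula and are not taken from the library source.

```java
public class FoldBounds {

    // Start index (inclusive) of the test portion of a fold.
    // Mirrors the documented formula; returns 0 when there are no folds.
    public static int start(int size, int fold, int numFolds) {
        if (numFolds == 0) return 0;
        return (int) (size * fold / (double) numFolds);
    }

    // End index (exclusive) of the test portion of a fold.
    public static int end(int size, int fold, int numFolds) {
        if (numFolds == 0) return 0;
        return start(size, fold + 1, numFolds);
    }

    public static void main(String[] args) {
        int size = 10;
        int numFolds = 3;
        for (int fold = 0; fold < numFolds; ++fold) {
            System.out.println("fold " + fold + ": ["
                    + start(size, fold, numFolds) + ", "
                    + end(size, fold, numFolds) + ")");
        }
        // prints:
        // fold 0: [0, 3)
        // fold 1: [3, 6)
        // fold 2: [6, 10)
    }
}
```

Because the cast truncates, fold sizes may differ by one item when the corpus size is not a multiple of the number of folds, and the last fold always ends at size().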
The randomization method permuteCorpus(Random) randomizes
the list of items. This can be useful for removing local
dependencies. See the section on thread safety below for more
information on the interaction of permutation and thread safety.
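
As an illustration of why permutation helps: if items arrive grouped by category, contiguous folds would each be dominated by one category. The sketch below shuffles a copy of such a list with java.util.Collections.shuffle and a seeded Random. The class PermuteSketch is hypothetical, and the use of Collections.shuffle is an assumption about how such a permutation could be done, not a statement about this class's internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class PermuteSketch {

    // Returns a shuffled copy; the same seed always yields the same order.
    public static List<String> permuted(List<String> items, long seed) {
        List<String> copy = new ArrayList<>(items);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Items grouped by label, as they might arrive from a sorted source.
        List<String> items = Arrays.asList(
                "pos-1", "pos-2", "pos-3", "neg-1", "neg-2", "neg-3");
        System.out.println("original: " + items);
        System.out.println("permuted: " + permuted(items, 42L));
    }
}
```

Passing a Random built from a fixed seed keeps the fold assignment reproducible across runs.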
Corpus.visitCorpus(Handler) will run the specified handler over all of
the data collected in this corpus.
If the number of folds is set to 1, then Corpus.visitTest(Handler) visits the entire corpus.
If the number of folds is set to 0, Corpus.visitTrain(Handler)
visits the entire corpus.
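
The train/test split described above can be sketched with a small self-contained class. FoldVisitSketch is hypothetical, java.util.function.Consumer stands in for ObjectHandler, and the visiting order for training items (items before the fold, then items after it) is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class FoldVisitSketch<E> {
    private final List<E> items = new ArrayList<>();
    private final int numFolds;

    public FoldVisitSketch(int numFolds) {
        if (numFolds < 0) throw new IllegalArgumentException("numFolds must be >= 0");
        this.numFolds = numFolds;
    }

    // Collects an item, in arrival order.
    public void handle(E e) { items.add(e); }

    private int start(int fold) {
        return numFolds == 0 ? 0 : (int) (items.size() * fold / (double) numFolds);
    }

    private int end(int fold) {
        return numFolds == 0 ? 0 : start(fold + 1);
    }

    // Test portion: the items inside [start, end).
    public void visitTest(Consumer<E> handler, int fold) {
        items.subList(start(fold), end(fold)).forEach(handler);
    }

    // Training portion: everything outside [start, end).
    public void visitTrain(Consumer<E> handler, int fold) {
        items.subList(0, start(fold)).forEach(handler);
        items.subList(end(fold), items.size()).forEach(handler);
    }

    public static void main(String[] args) {
        FoldVisitSketch<Integer> corpus = new FoldVisitSketch<>(3);
        for (int i = 0; i < 6; ++i) corpus.handle(i);
        List<Integer> train = new ArrayList<>();
        List<Integer> test = new ArrayList<>();
        corpus.visitTrain(train::add, 1);
        corpus.visitTest(test::add, 1);
        System.out.println("train=" + train + " test=" + test);
        // prints: train=[0, 1, 4, 5] test=[2, 3]
    }
}
```

With one fold the test range covers every item and the training range is empty; with zero folds the test range is empty and training covers every item, matching the two special cases stated above.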
This class must be used with external read/write synchronization. The write operations include the constructor and the set-fold, set-number-of-folds, permute-corpus, and handle methods. The read operations include the visit, number-of-instances, and fold-reporting methods.
Specifically, if the corpus is not being written to, folds may be visited concurrently. The writers are handle(), setFold(), setNumFolds(),
and permuteCorpus().
itemView() returns a view of the corpus with an immutable item list
that still allows the number of folds and the fold to be set. In
particular, as long as the underlying corpus is not modified,
a view for each fold may be created and run concurrently.
If a common evaluator is used, access to it must be synchronized to set the appropriate model and run the evaluation. If a separate evaluator is used per thread, there is no need for synchronization.
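
The one-task-per-fold pattern can be sketched with the standard java.util.concurrent executor API. The class ConcurrentFoldsSketch is hypothetical and does not use the corpus class itself: a read-only list stands in for the item view, and a per-task counter stands in for a per-thread evaluator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ConcurrentFoldsSketch {

    // Counts the test items of each fold in parallel; numFolds must be > 0.
    public static int[] testSizes(List<String> items, int numFolds) {
        // Read-only view: no task can modify the shared item list.
        List<String> readOnly = Collections.unmodifiableList(items);
        ExecutorService pool = Executors.newFixedThreadPool(numFolds);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (int f = 0; f < numFolds; ++f) {
                final int fold = f;
                results.add(pool.submit(() -> {
                    int size = readOnly.size();
                    int start = (int) (size * fold / (double) numFolds);
                    int end = (int) (size * (fold + 1) / (double) numFolds);
                    int count = 0; // task-local state, so no locking is needed
                    for (String item : readOnly.subList(start, end)) {
                        count++;
                    }
                    return count;
                }));
            }
            int[] sizes = new int[numFolds];
            for (int f = 0; f < numFolds; ++f) {
                sizes[f] = results.get(f).get(); // waits for the fold's task
            }
            return sizes;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> items = new ArrayList<>();
        for (int i = 0; i < 10; ++i) items.add("item-" + i);
        int[] sizes = testSizes(items, 3);
        System.out.println(sizes[0] + " " + sizes[1] + " " + sizes[2]);
        // prints: 3 3 4
    }
}
```

If the tasks instead shared one evaluator object, each access to it would need to be guarded, for example with a synchronized block.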
XValidatingObjectCorpus may be serialized. The
corpus read back in will have the same items in the same
permutation, with the same number of folds and the same fold
set as the corpus at the point it was serialized.

| Constructor and Description |
|---|
| XValidatingObjectCorpus(int numFolds): Construct a cross-validating corpus with the specified number of folds. |

| Modifier and Type | Method and Description |
|---|---|
| int | fold(): Returns the current fold. |
| void | handle(E e): Add the specified item to the end of the corpus. |
| XValidatingObjectCorpus<E> | itemView(): Returns a cross-validating corpus whose items are an immutable view of the items in this corpus, but whose number of folds or fold may be changed. |
| int | numFolds(): Return the number of folds for this cross-validating corpus. |
| void | permuteCorpus(Random random): Randomly permutes the corpus using the specified randomizer. |
| void | setFold(int fold): Set the current fold to the specified value. |
| void | setNumFolds(int numFolds): Sets the number of folds to the specified value. |
| int | size(): Return the number of items in this corpus. |
| void | visitCorpus(ObjectHandler<E> handler): Visit the entire corpus, sending all extracted events to the specified handler. |
| void | visitCorpus(ObjectHandler<E> trainHandler, ObjectHandler<E> testHandler): Visit the entire corpus, first sending training events to the specified training handler and then sending testing events to the test handler. |
| void | visitTest(ObjectHandler<E> handler): Send all of the test items to the specified handler. |
| void | visitTest(ObjectHandler<E> handler, int fold): Visit the test portion of the specified fold with the specified handler. |
| void | visitTrain(ObjectHandler<E> handler): Send all of the training items to the specified handler. |
| void | visitTrain(ObjectHandler<E> handler, int fold): Visit the training portion of the specified fold with the specified handler. |

public XValidatingObjectCorpus(int numFolds)
See the class documentation above for information on how the number of folds is used.
numFolds - Number of folds in the corpus.
IllegalArgumentException - If the number of folds is negative.

public XValidatingObjectCorpus<E> itemView()
Attempts to modify the items or their order using handle() or permuteCorpus() will raise an UnsupportedOperationException. Note that permuting a
zero-length or length-one list does not modify it, so permuting
an unmodifiable list of length zero or one does not raise an
UnsupportedOperationException.
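
The length-zero and length-one exception can be reproduced with the standard collections API. This sketch assumes the permutation behaves like java.util.Collections.shuffle applied to an unmodifiable list; the class UnmodifiableShuffleSketch is hypothetical and is not part of the library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class UnmodifiableShuffleSketch {

    // Reports whether shuffling an unmodifiable view of the list throws.
    public static boolean shuffleThrows(List<String> items) {
        List<String> view = Collections.unmodifiableList(items);
        try {
            Collections.shuffle(view, new Random(42L));
            return false; // nothing had to be written, so nothing failed
        } catch (UnsupportedOperationException e) {
            return true;  // the shuffle tried to set an element
        }
    }

    public static void main(String[] args) {
        System.out.println(shuffleThrows(new ArrayList<String>()));
        System.out.println(shuffleThrows(new ArrayList<>(Arrays.asList("a"))));
        System.out.println(shuffleThrows(new ArrayList<>(Arrays.asList("a", "b"))));
        // prints:
        // false
        // false
        // true
    }
}
```

For lists of two or more items the shuffle writes to the list on every swap, even when an element is swapped with itself, so the exception does not depend on the random seed.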
public int numFolds()
public void setNumFolds(int numFolds)
See the class documentation above for information on how the number of folds is used.
numFolds - Number of folds.
IllegalArgumentException - If the number of folds is negative.

public int fold()
public void permuteCorpus(Random random)
random - Randomizer to use for permutation.

public void setFold(int fold)
Warning: If the number of folds is set to zero, this method will throw an exception.
IllegalArgumentException - If the fold is less than 0 or greater than or equal to the number of folds.

public int size()
public void handle(E e)
handle in interface ObjectHandler<E>
e - Item to add to corpus.

public void visitTrain(ObjectHandler<E> handler)
visitTrain in class Corpus<ObjectHandler<E>>
handler - Handler receiving training items.

public void visitTest(ObjectHandler<E> handler)
visitTest in class Corpus<ObjectHandler<E>>
handler - Handler receiving testing items.

public void visitCorpus(ObjectHandler<E> handler)
This is just a convenience method that is defined by:
visitCorpus(handler, handler);
visitCorpus in class Corpus<ObjectHandler<E>>
handler - Handler for events extracted from the corpus.

public void visitCorpus(ObjectHandler<E> trainHandler, ObjectHandler<E> testHandler)
This is just a convenience method that is defined by:
visitTrain(trainHandler); visitTest(testHandler);
visitCorpus in class Corpus<ObjectHandler<E>>
trainHandler - Handler for training events from the corpus.
testHandler - Handler for testing events from the corpus.

public void visitTest(ObjectHandler<E> handler, int fold)
This method ignores the value of the current fold.
handler - Handler for objects in corpus.
fold - Fold whose test portion is visited.

public void visitTrain(ObjectHandler<E> handler, int fold)
This method ignores the value of the current fold.
handler - Handler for objects in corpus.
fold - Fold whose training portion is visited.

Copyright © 2016 Alias-i, Inc. All rights reserved.