v.kcv - Randomly partition points into test/train sets.
v.kcv [-dq] input=name output=name k=integer column=string [--overwrite] [--verbose] [--quiet]
- -d: Use drand48()
- --overwrite: Allow output files to overwrite existing files
- --verbose: Verbose module output
- --quiet: Quiet module output
- input=name: Name of input vector map
- output=name: Name for output vector map
- k=integer: Number of partitions (options: 1-32767)
- column=string: Name for new column to which partition number is written (default: part)
v.kcv randomly divides a list of points into k sets of
test/train data (for k-fold cross validation).
Test partitions are mutually exclusive. That is, a point will
appear in only one test partition and k-1 training partitions.
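The mutual-exclusivity property can be illustrated with a short Python sketch. It is not part of v.kcv: the function name folds and the sample data are hypothetical, and it assumes the partition numbers written to the column run from 1 to k.

```python
def folds(partition_of, k):
    """Yield (fold, train_ids, test_ids) for folds 1..k.

    partition_of maps a point id to its partition number
    (assumed here to be an integer in 1..k).
    """
    for fold in range(1, k + 1):
        # A point is tested only in the fold matching its partition number...
        test = [pid for pid, part in partition_of.items() if part == fold]
        # ...and is used for training in every other fold.
        train = [pid for pid, part in partition_of.items() if part != fold]
        yield fold, train, test


# Hypothetical partition numbers for six points with k=3.
partition_of = {"a": 1, "b": 2, "c": 3, "d": 1, "e": 2, "f": 3}
splits = list(folds(partition_of, 3))
```

Each point id ends up in exactly one test list and in k-1 training lists, which is the property stated above.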
The program generates a random point using the selected
random number generator and then finds the closest point to
it. This site is removed from the candidate list (meaning
that it will not be selected for any other test set) and
saved in the first test partition file. This is repeated
until enough points have been selected for the test partition.
The number of points chosen for test partitions
depends upon the number of sites available and the number
of partitions chosen (this number is made as consistent as
possible while ensuring that all sites will be chosen for
testing). This process of filling up a test partition is
done k times.
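The selection procedure described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the module's actual source: it uses Python's random module in place of the C generators, a brute-force nearest-point search, and assumes "as consistent as possible" means partition sizes that differ by at most one. The name kcv_partition and the bbox argument (standing in for the region) are hypothetical.

```python
import random


def kcv_partition(points, k, bbox, seed=None):
    """Assign each point a partition number in 1..k.

    points: list of (x, y) tuples
    bbox:   (xmin, ymin, xmax, ymax), standing in for the region
    Returns a list of partition numbers parallel to points.
    """
    rng = random.Random(seed)
    n = len(points)
    xmin, ymin, xmax, ymax = bbox
    # Assumed size rule: sizes differ by at most one and sum to n,
    # so every point is chosen for testing exactly once.
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    remaining = set(range(n))  # candidate list
    part = [0] * n
    for fold, size in enumerate(sizes, start=1):
        for _ in range(size):
            # Generate a random location inside the region.
            rx = rng.uniform(xmin, xmax)
            ry = rng.uniform(ymin, ymax)
            # Find the closest remaining point (squared distance suffices).
            nearest = min(
                remaining,
                key=lambda i: (points[i][0] - rx) ** 2 + (points[i][1] - ry) ** 2,
            )
            part[nearest] = fold
            # Remove it so it cannot enter another test partition.
            remaining.remove(nearest)
    return part


# Usage with ten hypothetical points and k=3.
pts = [(float(i), float(i % 3)) for i in range(10)]
part = kcv_partition(pts, 3, (0.0, 0.0, 9.0, 2.0), seed=1)
```

Because random locations are drawn uniformly over the region, points in sparse areas are more likely to be the nearest neighbour of a random location, so the order of selection depends on the spatial layout of the input.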
An ideal random sites generator will follow a Poisson distribution and will
only be as random as the original points. This program
simply divides points up in a random manner.
Be warned that random number generation occurs over the
intervals defined by the region of the map.
This program may not work properly with Lat-long data.
James Darrell McCauley
when he was at:
Update to GRASS 5.7: Radim Blazek, 10/2004
Last changed: $Date: 2006-01-02 06:44:52 -0800 (Mon, 02 Jan 2006) $
© 2003-2008 GRASS Development Team