About Parameters and Parameter Files

ECJ relies heavily on parameter files for nearly every conceivable parameter setting. It even relies on parameter files to determine which classes to use in different places. This means that understanding parameters and parameter files is crucial to using ECJ.

Parameters

ECJ's parameters are written one to a line in Java property-list style. They may be in one of the two following formats:

parametername = value
parametername value

The second option is deprecated; please don't use it. Whitespace around the name and the value is stripped. Parameter values may contain internal whitespace but parameter names may not. Blank lines and lines beginning with a "#" are ignored. Parameter names and values are case-sensitive.
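
To make the line format concrete, here is a minimal Java sketch of how a single parameter line could be split into a name and a value. ParamLineSketch and its parse method are made up for illustration and are not part of ECJ. The sketch also assumes that a "#" still marks a comment when it follows leading whitespace.

```java
/** Hypothetical helper for illustration; not part of ECJ. */
final class ParamLineSketch {

    /**
     * Splits one line into {name, value}.
     * Returns null for blank lines and for lines starting with "#".
     */
    static String[] parse(String line) {
        String s = line.trim();                      // surrounding whitespace is stripped
        if (s.isEmpty() || s.startsWith("#")) {
            return null;                             // blank line or comment
        }
        // The name ends at the first whitespace character or "=",
        // because parameter names may not contain whitespace.
        int i = 0;
        while (i < s.length()
                && !Character.isWhitespace(s.charAt(i))
                && s.charAt(i) != '=') {
            i++;
        }
        String name = s.substring(0, i);
        String rest = s.substring(i).trim();
        if (rest.startsWith("=")) {                  // "name = value" form
            rest = rest.substring(1).trim();
        }                                            // else: deprecated "name value" form
        return new String[] { name, rest };
    }

    public static void main(String[] args) {
        String[] p = parse("  pop.subpop.0.size = 500  ");
        System.out.println(p[0] + " -> " + p[1]);
    }
}
```

Because only the first separator ends the name, a value such as "my file.stat" keeps its internal whitespace.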

Parameter values are interpreted as one of five data types, depending on the parameter:

Parameter Files

ECJ reads parameters from a hierarchical set of parameter files, typically called "params" or ending with the extension ".params". When you start up ECJ, you specify a parameter file as such:

java ec.Evolve -file myParameterFile

Parameter files can have multiple parents which define additional parameters. A parameter file specifies that it has a parent with a special parameter:

parent.n = parentFile

...where n indicates that the parent is parent #n. n starts at 0 and increases. Your parents must be assigned with consecutive parameter names starting with parent.0. For example:

parent.0 = ../../myFirstParent.params
parent.1 = ../../../mySecondParent.params
parent.2 = ../foo/bar/myThirdParent.params
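
The requirement for consecutive numbering is easier to see in code. The following Java sketch collects parent.0, parent.1, and so on from an already-loaded set of parameters. It assumes the scan stops at the first missing index, which is one plausible reading of the rule above and not a statement about ECJ's actual loader. ParentListSketch is a hypothetical name, and java.util.Properties is used only as a convenient stand-in for a parameter table.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of ECJ. */
final class ParentListSketch {

    /** Returns the parent paths in order, stopping at the first gap in the numbering. */
    static List<String> parents(Properties params) {
        List<String> result = new ArrayList<>();
        for (int n = 0; ; n++) {
            String path = params.getProperty("parent." + n);
            if (path == null) {
                break;                  // assumption: no parent.n means no further parents
            }
            result.add(path.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("parent.0", "../../myFirstParent.params");
        params.setProperty("parent.2", "../foo/bar/myThirdParent.params"); // parent.1 is missing
        System.out.println(parents(params).size() + " parent(s) found");
    }
}
```

Under that assumption, a file which defines parent.0 and parent.2 but skips parent.1 would silently lose its third parent.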

Precedence

Parameters may also be defined on the command line when running ECJ with the "-p" option, which may appear multiple times. No space may appear between the parameter name, "=", and value. For example:

java ec.Evolve -file my.params -p extraparam=extravalue -p anotherparam=anothervalue
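
As an illustration of why no spaces are allowed around the "=", here is a small Java sketch that pulls "-p" settings out of an argument array. Each setting has to arrive as a single name=value token, since the shell splits arguments on whitespace. CommandLineParamsSketch is a hypothetical helper and does not reproduce ECJ's own argument handling.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper for illustration; not part of ECJ. */
final class CommandLineParamsSketch {

    /** Collects every "-p name=value" pair, in the order given. */
    static Map<String, String> parse(String[] args) {
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i + 1 < args.length; i++) {
            if (!args[i].equals("-p")) {
                continue;                            // some other option, such as -file
            }
            String pair = args[++i];                 // the token right after -p
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                        "expected name=value after -p, got: " + pair);
            }
            out.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        String[] example = { "-file", "my.params",
                             "-p", "extraparam=extravalue",
                             "-p", "anotherparam=anothervalue" };
        System.out.println(parse(example).keySet());
    }
}
```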

Parameters may further be programmatically defined internally by the system, though ECJ presently never does this. If you have two parameters with the same name, here are the rules guiding which ones take precedence:

ECJ's Parameter Style

Since numerous objects read parameters from the parameter database, ECJ organizes its parameter namespace hierarchically using periods to separate elements in parameter names. Let's begin with the simplest situation: some ECJ parameters are simple global parameters. For example,

evalthreads = 4

...tells ECJ that it should spawn 4 threads when doing population evaluation. Other parameters are organized hierarchically because it's cleaner that way. For example, if evalthreads and breedthreads are both 4, then there are 4 seeds for the random number generator which must be defined. They are defined as such (note the period between seed and the number n):

seed.0 = 2341
seed.1 = 7234123
seed.2 = 411
seed.3 = 34021239

It's common for arrays of objects to be defined like this, with numbers representing their position.
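
A reader of such an array simply appends the index to the shared prefix. The Java sketch below reads seed.0 through seed.(threads-1) and complains if one is missing. SeedArraySketch is a hypothetical name, java.util.Properties stands in for the parameter database, and the thread count is passed in directly so the sketch need not guess how ECJ derives it.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of ECJ. */
final class SeedArraySketch {

    /** Reads one seed per thread from seed.0, seed.1, ... */
    static int[] seeds(Properties params, int threads) {
        int[] result = new int[threads];
        for (int n = 0; n < threads; n++) {
            String name = "seed." + n;               // note the period before the index
            String value = params.getProperty(name);
            if (value == null) {
                throw new IllegalStateException("missing parameter " + name);
            }
            result[n] = Integer.parseInt(value.trim());
        }
        return result;
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("seed.0", "2341");
        params.setProperty("seed.1", "7234123");
        params.setProperty("seed.2", "411");
        params.setProperty("seed.3", "34021239");
        System.out.println(seeds(params, 4).length + " seeds read");
    }
}
```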

The period is used for other hierarchical purposes as well. When an object contains other objects as subordinates, they fall within its hierarchy. Such objects have a parameter base which is prefixed to their parameter names. For example, the global Population instance contains an array of Subpopulation instances, each of which in turn contains a variety of objects. Here's how the Population instance is defined, how the number of subpopulations it contains is set, how the classes for its subpopulations are defined, and how the number of individuals in each one is set:

# We're doing some coevolution, so we need two
# subpopulations, each with 500 individuals
pop = ec.Population
pop.subpops = 2
pop.subpop.0 = ec.Subpopulation
pop.subpop.0.size = 500
pop.subpop.1 = ec.Subpopulation
pop.subpop.1.size = 500

Note that the parameters for each subpopulation begin with the parameter base pop.subpop.n. Each Subpopulation instance requests a "size" relative to the parameter base handed to it by its "controlling" object. As you might guess, these hierarchical bases can get very long.
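
Here is a rough Java sketch of that handing-down of bases. The controlling code builds pop.subpop.n for each subpopulation, and the size is then requested relative to that base. ParamBaseSketch and its push method are hypothetical stand-ins written for this example (ECJ has its own classes for this, which are not shown here), and java.util.Properties again plays the role of the parameter database.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of ECJ. */
final class ParamBaseSketch {

    /** Extends a base by one element, separated by a period. */
    static String push(String base, String element) {
        return base + "." + element;
    }

    /** Reads pop.subpops, then pop.subpop.n.size for each n. Assumes all are defined. */
    static int[] subpopSizes(Properties params) {
        String popBase = "pop";
        int count = Integer.parseInt(params.getProperty(push(popBase, "subpops")).trim());
        int[] sizes = new int[count];
        for (int n = 0; n < count; n++) {
            // The controlling object builds the base it hands to subpopulation n.
            String subpopBase = push(push(popBase, "subpop"), String.valueOf(n));
            // The subpopulation asks for "size" relative to that base.
            sizes[n] = Integer.parseInt(params.getProperty(push(subpopBase, "size")).trim());
        }
        return sizes;
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("pop.subpops", "2");
        params.setProperty("pop.subpop.0.size", "500");
        params.setProperty("pop.subpop.1.size", "500");
        System.out.println(subpopSizes(params).length + " subpopulation sizes read");
    }
}
```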

If an object needs a given parameter, and the parameter does not exist with the provided base, then the object can check a default base for the parameter. For example, let's say that breeding pipeline #0 of the species for subpopulation #1 of the population is a MutationPipeline (GP point mutation) and is using Tournament Selection as its source #0 to select individuals. It might declare some information thusly:

pop.subpop.1.species.pipe.0 = ec.gp.koza.MutationPipeline

pop.subpop.1.species.pipe.0.source.0 = ec.select.TournamentSelection

...we can custom-define the tournament size parameter by tacking it onto this base as:

pop.subpop.1.species.pipe.0.source.0.size = 7

...or we can fall back on a "default" setting for this parameter for all Tournament Selection objects as:

select.tournament.size = 2

In this case the hierarchical parameter base is pop.subpop.1.species.pipe.0.source.0 and the "default base" for Tournament Selection is select.tournament. If the object looks in both places and still can't find the parameter defined (or it's improperly defined), it will issue an error. Some global objects don't have default parameter bases, but nearly every object which can be repeatedly declared in different places will have a default base.
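
The two-step lookup can be sketched in a few lines of Java. The helper below checks the hierarchical base first, then the default base, and raises an error if neither defines the parameter. DefaultBaseSketch is a hypothetical name, java.util.Properties stands in for the parameter database, and the error handling is a simplification.

```java
import java.util.Properties;

/** Hypothetical helper for illustration; not part of ECJ. */
final class DefaultBaseSketch {

    /**
     * Returns base.name if defined, otherwise defaultBase.name.
     * Pass null as defaultBase for objects that have no default base.
     */
    static String lookup(Properties params, String base, String defaultBase, String name) {
        String value = params.getProperty(base + "." + name);
        if (value == null && defaultBase != null) {
            value = params.getProperty(defaultBase + "." + name);   // fall back
        }
        if (value == null) {
            throw new IllegalStateException(
                    "parameter not found: " + base + "." + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("select.tournament.size", "2");
        String base = "pop.subpop.1.species.pipe.0.source.0";
        // Only the default is defined here, so the default is what gets used.
        System.out.println(lookup(params, base, "select.tournament", "size"));
    }
}
```

If pop.subpop.1.species.pipe.0.source.0.size were also set, it would be returned instead, since the hierarchical base is consulted first.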

In general, objects which read parameters fall into one of several classes:

Tracing Bases Through Class Documentation

The class documentation contains three tables which give information about parameters and parameter bases for instances of that class. The Parameters table indicates the valid parameters declared for that instance. The Default Base indicates the class's default base, if any. The Parameter Bases table indicates the new parameter bases for subsidiary objects to this instance. For example, here are the tables from ec.gp.koza.MutationPipeline, the class responsible for doing the GP point mutation operator:

     

Parameters
base.tries
    int >= 1
    (number of times to try finding valid pairs of nodes)
base.maxdepth
    int >= 1
    (maximum valid depth of a crossed-over subtree)
base.ns
    classname, inherits and != GPNodeSelector
    (GPNodeSelector for tree)
base.build.0
    classname, inherits and != GPNodeBuilder
    (GPNodeBuilder for new subtree)
equal
    bool = true or false (default)
    (do we attempt to replace the subtree with a new one of roughly the same size?)
base.tree.0
    0 < int < (num trees in individuals), if exists
    (tree chosen for mutation; if parameter doesn't exist, tree is picked at random)

Default Base
gp.koza.mutate

Parameter bases
base.ns
    nodeselect
base.build
    builder

MutationPipeline is derived from ec.BreedingPipeline, which adds the following tables:

     

Parameters
base.num-sources
    int >= 1
    (User-specified number of sources to the pipeline. Some pipelines have hard-coded numbers of sources; others indicate (with the java constant DYNAMIC_SOURCES) that the number of sources is determined by this user parameter instead.)
base.source.n
    classname, inherits and != BreedingSource, or the value same
    (Source n for this BreedingPipeline. If the value is set to same, then this source is the exact same source object as base.source.n-1, and further parameters for this object will be ignored and treated as the same as those for n-1. same is not valid for base.source.0)

Parameter bases
base.source.n
    Source n

ec.BreedingPipeline in turn is derived from ec.BreedingSource, which adds the following tables:

     

Parameters
base.prob
    0.0 <= float <= 1.0, or undefined
    (probability this BreedingSource gets chosen. Undefined is only valid if the caller of this BreedingSource doesn't need a probability)

Although MutationPipeline inherits all these parameters, the parameter base for all of them is the instance's parameter base, handed to it by its controller object. And the default base for all of them is always the last one defined (in this case, "gp.koza.mutate"). Default bases for parent classes are not used.

Back to our original example, imagine that we had a MutationPipeline used as breeding pipeline #0 of the species used in subpopulation #1 of the population:

pop.subpop.1.species.pipe.0 = ec.gp.koza.MutationPipeline

We could specify a probability for this pipeline as:

pop.subpop.1.species.pipe.0.prob = 0.9

...or we might specify a default probability (not necessarily a good idea) for all MutationPipelines as:

gp.koza.mutate.prob = 0.4
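
The rule that inherited parameters use the most-derived class's default base can be mimicked with ordinary method overriding. In the Java sketch below, the class that reads prob sits at the top of a small hierarchy, but the default base it consults comes from the concrete subclass. All three classes are hypothetical stand-ins named after the ECJ classes they imitate. The sketch does not model the case where prob is legitimately undefined, and it says nothing about how ECJ implements this internally.

```java
import java.util.Properties;

/** Stand-in for ec.BreedingSource: the class that declares "prob". */
abstract class BreedingSourceSketch {

    /** Each concrete class reports its own default base. */
    abstract String defaultBase();

    /** Looks under the instance's base first, then under the concrete class's default base. */
    double readProb(Properties params, String base) {
        String value = params.getProperty(base + ".prob");
        if (value == null) {
            value = params.getProperty(defaultBase() + ".prob");
        }
        if (value == null) {
            throw new IllegalStateException("prob not defined for " + base);
        }
        return Double.parseDouble(value.trim());
    }
}

/** Stand-in for ec.BreedingPipeline; it adds nothing needed for this example. */
abstract class BreedingPipelineSketch extends BreedingSourceSketch { }

/** Stand-in for ec.gp.koza.MutationPipeline. */
final class MutationPipelineSketch extends BreedingPipelineSketch {
    @Override
    String defaultBase() {
        return "gp.koza.mutate";
    }
}

final class InheritedParamsDemo {
    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("gp.koza.mutate.prob", "0.4");
        MutationPipelineSketch pipe = new MutationPipelineSketch();
        // No pop.subpop.1.species.pipe.0.prob is set, so the default base is consulted.
        System.out.println(pipe.readProb(params, "pop.subpop.1.species.pipe.0"));
    }
}
```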

MutationPipeline contains two subsidiary instances, one which subclasses from ec.gp.GPNodeSelector, and one which subclasses from ec.gp.GPNodeBuilder. The first is responsible for picking a subtree to mutate, and the second is responsible for creating a new subtree. We specify classes for those instances in their parameters (we'll use a KozaNodeSelector and a GrowBuilder):

pop.subpop.1.species.pipe.0.ns.0 = ec.gp.koza.KozaNodeSelector
pop.subpop.1.species.pipe.0.build.0 = ec.gp.koza.GrowBuilder

Of course, we might provide default choices as well:

gp.koza.mutate.ns.0 = ec.gp.koza.KozaNodeSelector
gp.koza.mutate.build.0 = ec.gp.koza.GrowBuilder

These two objects have parameters to set up as well. Their parameter bases are specified as base.ns and base.build respectively. In this case, it means that their parameter bases are pop.subpop.1.species.pipe.0.ns.0 and pop.subpop.1.species.pipe.0.build.0. And thus the cycle of life continues. For example, KozaNodeSelectors have a default base of gp.koza.ns and a root parameter which specifies the probability that they pick the root of a tree. The root parameter would then be found at pop.subpop.1.species.pipe.0.ns.0.root, or the default value at gp.koza.ns.root.

Where to look for specifics about parameters

There are way too many possible parameters to discuss here. Here are some places to start digging.

Parameters currently used by Symbolic Regression

Some are global parameters, some are defined through the parameter base hierarchy, and some are defined through default bases. The parameter files are app/regression/noerc.params, its parent gp/koza/params, and its parent simple/params.

Number of threads and random number generator seeds
breedthreads = 1
evalthreads = 1
seed.0 = 4357

Garbage collection
gc = false
aggressive = true
gc-modulo = 1

Checkpointing
checkpoint = false
checkpoint-modulo = 1
prefix = ec

Outputting Stuff
nostore = false
flush = true
verbosity = 0

The EvolutionState Object
state = ec.simple.SimpleEvolutionState

Evolution Parameters
generations = 51
quit-on-run-complete = true

The Initializer, Breeder, Exchanger, and Finisher
breed = ec.simple.SimpleBreeder
exch = ec.simple.SimpleExchanger
finish = ec.simple.SimpleFinisher
init = ec.gp.GPInitializer

The Evaluator and the Problem (ADF stuff is always loaded but not used in this case)
eval = ec.simple.SimpleEvaluator
eval.problem = ec.app.regression.Regression
eval.problem.data = ec.app.regression.RegressionData
eval.problem.stack = ec.gp.ADFStack
eval.problem.stack.context = ec.gp.ADFContext
eval.problem.stack.context.data = ec.app.regression.RegressionData

The Statistics
stat = ec.gp.koza.KozaStatistics
stat.file = $out.stat

Default Tournament Selection tournament size
select.tournament.size = 7

Default HalfBuilder (ramped half/half tree building) parameters
gp.koza.half.growp = 0.5
gp.koza.half.max-depth = 6

Default KozaNodeSelector parameters
gp.koza.ns.nonterminals = 0.9
gp.koza.ns.root = 0.0
gp.koza.ns.terminals = 0.1

Default Reproduction operator parameters
gp.koza.reproduce.source.0 = ec.select.TournamentSelection

Default Crossover operator parameters 
gp.koza.xover.maxdepth = 17
gp.koza.xover.ns.0 = ec.gp.koza.KozaNodeSelector
gp.koza.xover.ns.1 = same
gp.koza.xover.source.0 = ec.select.TournamentSelection
gp.koza.xover.source.1 = same
gp.koza.xover.tries = 1

Function Sets (there's only one)
gp.fs.size = 1
gp.fs.0 = ec.gp.GPFunctionSet
gp.fs.0.info = ec.gp.GPFuncInfo
gp.fs.0.name = f0
gp.fs.0.size = 9
gp.fs.0.func.0 = ec.app.regression.func.X
gp.fs.0.func.0.nc = nc0
gp.fs.0.func.1 = ec.app.regression.func.Add
gp.fs.0.func.1.nc = nc2
gp.fs.0.func.2 = ec.app.regression.func.Mul
gp.fs.0.func.2.nc = nc2
gp.fs.0.func.3 = ec.app.regression.func.Sub
gp.fs.0.func.3.nc = nc2
gp.fs.0.func.4 = ec.app.regression.func.Div
gp.fs.0.func.4.nc = nc2
gp.fs.0.func.5 = ec.app.regression.func.Sin
gp.fs.0.func.5.nc = nc1
gp.fs.0.func.6 = ec.app.regression.func.Cos
gp.fs.0.func.6.nc = nc1
gp.fs.0.func.7 = ec.app.regression.func.Exp
gp.fs.0.func.7.nc = nc1
gp.fs.0.func.8 = ec.app.regression.func.Log
gp.fs.0.func.8.nc = nc1

Standard Node Constraints for untyped GP with nodes of various arity sizes
gp.nc.size = 7
gp.nc.0 = ec.gp.GPNodeConstraints
gp.nc.0.name = nc0
gp.nc.0.returns = nil
gp.nc.0.size = 0
gp.nc.1 = ec.gp.GPNodeConstraints
gp.nc.1.name = nc1
gp.nc.1.returns = nil
gp.nc.1.size = 1
gp.nc.1.child.0 = nil
gp.nc.2 = ec.gp.GPNodeConstraints
gp.nc.2.name = nc2
gp.nc.2.returns = nil
gp.nc.2.size = 2
gp.nc.2.child.0 = nil
gp.nc.2.child.1 = nil
gp.nc.3 = ec.gp.GPNodeConstraints
gp.nc.3.name = nc3
gp.nc.3.returns = nil
gp.nc.3.size = 3
gp.nc.3.child.0 = nil
gp.nc.3.child.1 = nil
gp.nc.3.child.2 = nil
gp.nc.4 = ec.gp.GPNodeConstraints
gp.nc.4.name = nc4
gp.nc.4.returns = nil
gp.nc.4.size = 4
gp.nc.4.child.0 = nil
gp.nc.4.child.1 = nil
gp.nc.4.child.2 = nil
gp.nc.4.child.3 = nil
gp.nc.5 = ec.gp.GPNodeConstraints
gp.nc.5.name = nc5
gp.nc.5.returns = nil
gp.nc.5.size = 5
gp.nc.5.child.0 = nil
gp.nc.5.child.1 = nil
gp.nc.5.child.2 = nil
gp.nc.5.child.3 = nil
gp.nc.5.child.4 = nil
gp.nc.6 = ec.gp.GPNodeConstraints
gp.nc.6.name = nc6
gp.nc.6.returns = nil
gp.nc.6.size = 6
gp.nc.6.child.0 = nil
gp.nc.6.child.1 = nil
gp.nc.6.child.2 = nil
gp.nc.6.child.3 = nil
gp.nc.6.child.4 = nil
gp.nc.6.child.5 = nil

Tree Constraints
gp.tc.size = 1
gp.tc.0 = ec.gp.GPTreeConstraints
gp.tc.0.init = ec.gp.koza.HalfBuilder
gp.tc.0.name = tc0
gp.tc.0.returns = nil

GP Types
gp.type.a.size = 1
gp.type.a.0.name = nil
gp.type.s.size = 0

The Population, and its one subpopulation, species, breeding pipelines and individuals
pop = ec.Population
pop.subpops = 1
pop.subpop.0 = ec.Subpopulation
pop.subpop.0.duplicate-retries = 100
pop.subpop.0.fitness = ec.gp.koza.KozaFitness
pop.subpop.0.size = 1000
pop.subpop.0.species = ec.gp.GPSpecies
pop.subpop.0.species.ind = ec.gp.GPIndividual
pop.subpop.0.species.ind.numtrees = 1
pop.subpop.0.species.ind.tree.0 = ec.gp.GPTree
pop.subpop.0.species.ind.tree.0.tc = tc0
pop.subpop.0.species.numpipes = 2
pop.subpop.0.species.pipe.0 = ec.gp.koza.CrossoverPipeline
pop.subpop.0.species.pipe.0.prob = 0.9
pop.subpop.0.species.pipe.1 = ec.gp.koza.ReproductionPipeline
pop.subpop.0.species.pipe.1.prob = 0.1