Documentation completion continues to be the priority theme in this release of 3 March 2010. Additionally, a number of changes and enhancements were made in the process:
-
Implemented support for templating. Towards this end, a TemplateTopLevel option for SymbolicRegression was implemented which facilitates forcing a desired output form — e.g., a conditional, exponential, etc. — in the generated models. The Crossover, MutateSubtree and DepthPreservingSubtreeMutation PropagationOperators were modified to support the preservation of any embedded templates. However, only the top-level pattern is viewed as sacred.
-
Implemented a ResponsePlot function which is similar to ResponseSurfacePlot except that variables are plotted individually as 2D rather than as all 3D pairwise combinations. This is useful to get a quick overview of the response behavior when models or ensembles feature many input variables. As with ResponseSurfacePlot, the settings for the model variables which are not being plotted can greatly affect both the scale and response behavior. To address this, a CommonPlotRange option was introduced which will place all of the synthesized graphics on the same vertical scale.
-
Deleted the ResponseSurfaceParameters option for ResponseSurfacePlot, DivergenceSurfacePlot (and ResponsePlot) with each now using the DataVariableRange and (newly introduced) DataVariableReference options. Valid settings for the DataVariableReference (which specifies the setting for all DataVariables not being modified in a given graphic) are: a specified point, Automatic (which uses the midpoint of the DataVariableRange), Random (which generates a random point in parameter space), ModelMaximum or ModelMinimum. The latter two settings will search for the appropriate extremal response points and use those.
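As a rough illustration of the Automatic and Random settings described above, the following Python sketch computes a reference point from per-variable ranges. It is not the package's implementation: the `reference_point` helper, its argument names and the dictionary layout are all assumptions made for the example.

```python
import random

def reference_point(ranges, mode="Automatic", rng=None):
    """Return one value per variable from {name: (low, high)} ranges.

    Hypothetical helper: mimics the Automatic (midpoint) and Random
    settings described in the notes, nothing more.
    """
    if mode == "Automatic":
        # Midpoint of each variable's range.
        return {name: (lo + hi) / 2 for name, (lo, hi) in ranges.items()}
    if mode == "Random":
        # Uniformly random point inside the box spanned by the ranges.
        rng = rng or random.Random()
        return {name: rng.uniform(lo, hi) for name, (lo, hi) in ranges.items()}
    raise ValueError(f"unsupported mode: {mode}")

ranges = {"x1": (0.0, 10.0), "x2": (-1.0, 1.0)}
midpoint = reference_point(ranges)
random_point = reference_point(ranges, "Random", random.Random(0))
```

The variables held fixed in a given plot would take their values from such a point, which is why the choice of reference can change both the scale and the shape of the plotted response.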
-
Added a CreateLinearModel function which creates a GPModel using the supplied or synthesized BasisSet. This is useful for creating reference conventional models for comparison to SymbolicRegression results.
-
Modified RandomGenomes and RandomModels to speed up model synthesis as well as increase the diversity of models synthesized. Five new options were implemented (TemplateTopLevel, BalancedTemplates, TemplateFunctionCount, TemplateDepth and SynthesisDepth) with AllowAtomicGenomes deleted. MinimumTreeDepth and MaximumTreeDepth now only apply to ExtractGenomeSubtrees.
-
Introduced BuildFunctionPatterns which uses FunctionPatternSynthesisRules (default associated with SymbolicRegression) to generate appropriate input for the FunctionPatterns option for SymbolicRegression. Several pattern sets ("BasicMath", "ExtendedMath", "PowerMath" etc.) have been pre-defined which can easily be mixed and extended to tailor the building blocks to the application characteristics. This is actually a really slick implementation since it allows the user to easily tweak the functional building blocks used in the model development.
-
Fixed a sin-of-omission so now RandomModels and RandomGenomes can handle all valid forms for the PopulationSize option. If a list of integers is supplied, the first number will be used as the targeted size.
-
Removed the Unique option for RandomModels since it was obsolete.
-
Modified the default FunctionPatterns so that summation and multiplication in RandomModels will have at least two arguments (and up to a MaximumArity of 5). Previously, it was easier to create models which had introns (non-functional genetics) because summation and multiplication could be generated with only a single argument.
-
Modified RemoveModelScaling so that any ModelingObjectiveNames in the ModelPersonality are removed along with the ModelFitness being reset.
-
Fixed a bug introduced in Release 16.0 in RandomModels wherein the supplied variables were not properly weighted for selection during model synthesis. This would have been an issue for modeling systems with large numbers of input variables.
-
Uncovered a bug in SymbolicRegression wherein the ModelingVariables were all treated as having equal weights for RandomModel synthesis independent of any individual or class weighting.
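To make the effect of variable weighting concrete, here is a minimal Python sketch of weighted selection. The `pick_variables` helper and the example weights are hypothetical and are not taken from the package; the sketch only shows how unequal weights change how often each variable is drawn.

```python
import random
from collections import Counter

def pick_variables(weights, count, rng):
    """Draw `count` variable names with probability proportional to weight.

    Hypothetical helper for illustration only.
    """
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=count)

rng = random.Random(42)
weights = {"x1": 8.0, "x2": 1.0, "x3": 1.0}  # x1 is favored 8:1:1
draws = Counter(pick_variables(weights, 10_000, rng))
```

If the weights were ignored, each variable would appear roughly equally often, so with many input variables the favored ones would rarely be selected.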
-
Fixed a bug in MutateSubtree and DepthPreservingSubtreeMutation wherein the ModelFitness in the modified models was not being reset to Indeterminate.
-
Fixed a bug in CreateFittedEnsemble wherein SelectModels option defaults associated with CreateFittedEnsemble were not being passed through properly.
-
Fixed a bug in AlignModelExpression wherein option settings embedded in the ModelPersonality were not being used. This sin-of-omission rippled into other functions; however, it did not affect SymbolicRegression (where the model alignment typically occurs).
-
Modified the ParetoGP EvolutionStrategy so that both the archive and the final population are presented to the ResultsSelectionStrategy. This is important if a SecondaryModelingObjective has been used since moving to only considering the ModelingObjective can mean that some of the long tail models (e.g., overly complex low-dimensional models if a ModelDimensionality was used as the secondary objective) would not be of user interest.
-
Changed the default ResultsSelectionStrategy to return the 50% of developed models closest to the ParetoFront from the final population (and archive). This should return the entire archive used by ParetoGP along with some other models.
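The notes do not say how closeness to the ParetoFront is measured. The Python sketch below assumes one plausible reading: peel off successive non-dominated layers until half of the models have been collected. The function names, the (complexity, error) tuples and the tie-breaking by list order are all assumptions for illustration.

```python
import math

def dominates(a, b):
    """True if a is no worse than b on every objective and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(models):
    """Models not dominated by any other model (all objectives minimized)."""
    return [m for m in models if not any(dominates(o, m) for o in models)]

def select_closest(models, fraction=0.5):
    """Collect successive non-dominated layers until `fraction` is reached."""
    target = math.ceil(len(models) * fraction)
    remaining, chosen = list(models), []
    while remaining and len(chosen) < target:
        front = pareto_front(remaining)
        chosen.extend(front)
        remaining = [m for m in remaining if m not in front]
    # The last layer is truncated in list order (an arbitrary tie-break).
    return chosen[:target]

# Each model is a hypothetical (complexity, error) pair.
models = [(1, 0.9), (2, 0.5), (4, 0.2), (2, 0.95),
          (3, 0.6), (5, 0.3), (6, 0.7), (7, 0.8)]
front = pareto_front(models)
selected = select_closest(models)
```

In this example the first layer holds three models, so the selection contains the whole front plus one model from the next layer, which matches the stated intent of returning the archive along with some other models.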
-
Changed the default DataSubsetSelectionFunction to be RandomSample rather than RandomKSubset since the two are equivalent and RandomSample is about three times faster.
-
Renamed the NumberOfCascades option for SymbolicRegression to be CascadesPerEvolution. This makes its name explicit as well as consistent with the related GenerationsPerRun, RunsPerCascade and IndependentEvolutions options.
-
Fixed a bug in MergeInputResponseData wherein if an atomic structure was supplied which did not pass an AtomQ test (e.g., \[Pi]/2), the supplied components would not be properly merged.
-
Fixed a bug in AbsoluteCorrelation wherein symbolic input would return Indeterminate even though those symbols (e.g., \[Pi]) would evaluate to a real value. The revision also widens the implementation's existing speed advantage over the standard Correlation function.
-
Implemented support for TerminalSet -> None in SymbolicRegression, RandomModels and RandomGenomes. This facilitates modeling when only the variables are to be used in modeling.
-
Modified PolynomialBasisSet to allow PolynomialOrder, IncludeCrossTerms and IncludeConstantBasis to be supplied as options. Added the new symbols into the package documentation.
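For readers unfamiliar with polynomial basis sets, the following Python sketch enumerates the terms such a basis might contain. It is only an assumption about what the three option names mean (maximum total degree, whether products of different variables are included, and whether the constant term is included); the `polynomial_basis` function is hypothetical and does not reproduce the package's output format.

```python
from itertools import product

def polynomial_basis(variables, order=2, cross_terms=True, constant=True):
    """List monomials in `variables` up to total degree `order`."""
    terms = []
    for exps in product(range(order + 1), repeat=len(variables)):
        degree = sum(exps)
        used = sum(1 for e in exps if e > 0)  # number of distinct variables
        if degree == 0:
            if constant:
                terms.append("1")
            continue
        if degree > order:
            continue
        if used > 1 and not cross_terms:
            continue
        terms.append("*".join(
            f"{v}^{e}" if e > 1 else v
            for v, e in zip(variables, exps) if e > 0
        ))
    return terms

full = polynomial_basis(["x", "y"])
no_cross = polynomial_basis(["x", "y"], cross_terms=False)
no_constant = polynomial_basis(["x", "y"], constant=False)
```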
-
Modified ModelVariables (and VariablePresence when PresenceMetric -> Variables) to return the variables in the same order as produced by ModelInputVariables. This ripples into a number of other functions; however, the benefit is that model variables will be presented in the "natural order" defined by the input.
-
Implemented a Sigmoid function of the form x/(1+Abs@x). The definition of the Sigmoid is subject to change (e.g., to x/(1+x^2) or the classic (1-E^-x)/(1+E^-x)); however, this seems like a reasonable choice for a less discontinuous version of the UnitStep function.
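The three candidate forms mentioned above can be compared numerically. The Python sketch below simply transcribes the formulas; the function names are made up for the example and are not the package's symbols.

```python
import math

def sigmoid_abs(x):
    """x/(1+|x|): the form implemented in this release."""
    return x / (1 + abs(x))

def sigmoid_quadratic(x):
    """x/(1+x^2): one of the alternatives mentioned."""
    return x / (1 + x * x)

def sigmoid_exp(x):
    """(1-e^-x)/(1+e^-x): the classic form, equal to tanh(x/2)."""
    return (1 - math.exp(-x)) / (1 + math.exp(-x))

# Tabulate the three forms over a modest range of inputs.
samples = [-10.0, -1.0, 0.0, 1.0, 10.0]
table = [(x, sigmoid_abs(x), sigmoid_quadratic(x), sigmoid_exp(x)) for x in samples]
```

One difference worth noting when weighing the alternatives: x/(1+Abs@x) and the exponential form both increase monotonically toward ±1, whereas x/(1+x^2) peaks at x = ±1 and decays back toward zero for large inputs.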