This is a non-comprehensive guide to the FCUBE project code. If you would like to know more about the project or to continue its development, please contact us by email at iarnaldo@mit.edu.
We provide a Tomcat application, hosted on the FCUBE server, that lets users interact with FCUBE through a web interface. The web application wraps FCUBE's functionality and is meant to be an example of what can be done to facilitate the use of the FCUBE platform. However, the current release does not exploit all the features of FCUBE, so we encourage users to build their own applications according to their needs.
The web application also serves as an example of how to use the FCUBE Java executable and how to set up the project environment (folder structure). See the folder /usr/share/tomcat/ on an FCUBE server instance for the required folder structure.
The code of the FCUBE project is composed of:
The same executable is used, on the FCUBE instance side, to sample the training data and the parameters of the learners (factoring).
Below, we describe in detail the command-line interface of the FCUBE server, that is, the commands needed to deploy learners and to retrieve and fuse models. This functionality has only been tested in Debian environments and relies on the AWS CLI tools, which must be installed and configured with your own access and secret keys.
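Since the commands depend on both a Java runtime and the AWS CLI, a small pre-flight check can save a failed run. The following is a minimal sketch, not part of FCUBE: the helper name missing_tools is hypothetical, and it only assumes that the executables are named java and aws and are expected on the PATH.

```python
import shutil


def missing_tools(tools=("java", "aws")):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# Report anything that still needs to be installed before running FCUBE.
missing = missing_tools()
if missing:
    print("Missing required tools:", ", ".join(missing))
else:
    print("java and aws were both found on the PATH")
```

This only confirms that the executables exist. Whether the access and secret keys are configured can be inspected separately, for example with `aws configure list`.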
The learning strategy is specified in a file that is sent to all the FCUBE instances deployed within an FCUBE run.
There are two built-in parameters for data management: data_sample_rate and variable_sample_rate.
This file can also be used to specify ranges of values for learner-specific parameters such as learning rate, crossover rate, etc. For detailed information on how to specify parameter ranges, please check this documentation. Below we show and discuss an example for our GPFunction learner (a learner inspired by Evolutionary Computation):
fixed data fcube/higgs/train
fixed threads 2
data_sample_rate float discreteSet default ( 1 ) { 0.1 ; 0.2 }
variable_sample_rate float discreteSet default ( 1 ) { 0.25 ; 0.75 ; 1 }
false_negative_weight float range default ( 0.5 ) [ 0.4 : 0.05 : 0.6 ]
xover_op string discreteSet default ( SPUCrossover ) { SPUCrossover ; KozaCrossover }
pop_size int discreteSet default ( 1000 ) { 1000 ; 1500 ; 2000 }
factoredParams {data_sample_rate, variable_sample_rate, xover_op, pop_size}
The path to the data and the number of threads are declared as fixed parameters. The built-in parameters for data management (data_sample_rate and variable_sample_rate), as well as the crossover operator and the population size, are each assigned a discrete set of choices. The learner-specific parameter false_negative_weight is assigned a range of possible values. Finally, the last line lists the parameters that will be factored, that is, given a value selected stochastically from the possible choices. Because false_negative_weight is not in that list, it is the only non-fixed parameter that will be set to its default value (0.5).
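To make the factoring step concrete, here is a minimal Python sketch that parses the example options file above and draws one configuration from it. It is an illustration only, not FCUBE's own parser (the executable is written in Java). The function names are hypothetical, the range syntax is assumed to mean [ min : step : max ] with both ends included, and uniform random selection is assumed because the text only says the choice is stochastic.

```python
import random
import re

EXAMPLE = """\
fixed data fcube/higgs/train
fixed threads 2
data_sample_rate float discreteSet default ( 1 ) { 0.1 ; 0.2 }
variable_sample_rate float discreteSet default ( 1 ) { 0.25 ; 0.75 ; 1 }
false_negative_weight float range default ( 0.5 ) [ 0.4 : 0.05 : 0.6 ]
xover_op string discreteSet default ( SPUCrossover ) { SPUCrossover ; KozaCrossover }
pop_size int discreteSet default ( 1000 ) { 1000 ; 1500 ; 2000 }
factoredParams {data_sample_rate, variable_sample_rate, xover_op, pop_size}
"""

CASTS = {"float": float, "int": int, "string": str}

# name type (discreteSet|range) default ( value ) { ... } or [ ... ]
PARAM_RE = re.compile(
    r"^(?P<name>\w+)\s+(?P<type>\w+)\s+(?P<kind>discreteSet|range)\s+"
    r"default\s*\(\s*(?P<default>[^)]*?)\s*\)\s*"
    r"[\[{]\s*(?P<body>.*?)\s*[\]}]$"
)


def parse_options(text):
    """Split an options file into fixed values, parameter specs and factored names."""
    fixed, params, factored = {}, {}, []
    for line in (raw.strip() for raw in text.splitlines()):
        if not line:
            continue
        if line.startswith("fixed "):
            _, name, value = line.split(None, 2)
            fixed[name] = value  # fixed values are kept as plain strings
        elif line.startswith("factoredParams"):
            inner = line[line.index("{") + 1 : line.rindex("}")]
            factored = [p.strip() for p in inner.split(",") if p.strip()]
        else:
            match = PARAM_RE.match(line)
            if match is None:
                raise ValueError(f"unrecognized line: {line!r}")
            cast = CASTS[match["type"]]
            if match["kind"] == "discreteSet":
                choices = [cast(v.strip()) for v in match["body"].split(";")]
            else:
                # Assumption: [ min : step : max ], both ends included.
                lo, step, hi = (float(v) for v in match["body"].split(":"))
                count = int(round((hi - lo) / step))
                choices = [cast(round(lo + i * step, 10)) for i in range(count + 1)]
            params[match["name"]] = {
                "default": cast(match["default"]),
                "choices": choices,
            }
    return fixed, params, factored


def sample_configuration(fixed, params, factored, rng):
    """Pick a random choice for factored parameters and the default for the rest."""
    config = dict(fixed)
    for name, spec in params.items():
        if name in factored:
            config[name] = rng.choice(spec["choices"])  # assumed uniform
        else:
            config[name] = spec["default"]
    return config


fixed, params, factored = parse_options(EXAMPLE)
print(sample_configuration(fixed, params, factored, random.Random()))
```

Each call to sample_configuration corresponds to what one FCUBE instance would do under these assumptions: data_sample_rate, variable_sample_rate, xover_op and pop_size vary from draw to draw, while false_negative_weight stays at 0.5.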
Command:
$ java -jar fcube.jar -deploy gpfunction -n 40 -minutes 60 -key_name nachokey -options ruletree_factoring.options -flavor ec2_instance_type
where:
Command:
$ java -jar fcube.jar -retrieve mostAccurate.txt -keypairPath certs/nachokey.pem -learners gpfunction
where:
Command:
$ java -jar fcube.jar -filter-fuse higgs-alfa_9.csv higgs-alfa_10.csv -model mostAccurate.txt -fnweight 0.47 -learners gpfunction
where:
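For users building their own application around FCUBE, the three commands above can be assembled programmatically. The Python sketch below only reproduces the flags exactly as they appear in the example commands. The function and argument names are hypothetical, and the roles of the two CSV files passed to -filter-fuse are not assumed beyond their position on the command line.

```python
import shlex


def deploy_command(learner, n, minutes, key_name, options_file, flavor, jar="fcube.jar"):
    """Argument list for the deploy command shown above."""
    return ["java", "-jar", jar, "-deploy", learner, "-n", str(n),
            "-minutes", str(minutes), "-key_name", key_name,
            "-options", options_file, "-flavor", flavor]


def retrieve_command(model_file, keypair_path, learner, jar="fcube.jar"):
    """Argument list for the retrieve command shown above."""
    return ["java", "-jar", jar, "-retrieve", model_file,
            "-keypairPath", keypair_path, "-learners", learner]


def filter_fuse_command(first_csv, second_csv, model_file, fnweight, learner, jar="fcube.jar"):
    """Argument list for the filter-fuse command shown above."""
    return ["java", "-jar", jar, "-filter-fuse", first_csv, second_csv,
            "-model", model_file, "-fnweight", str(fnweight), "-learners", learner]


# Rebuild the three example commands and print them as shell lines.
commands = [
    deploy_command("gpfunction", 40, 60, "nachokey",
                   "ruletree_factoring.options", "ec2_instance_type"),
    retrieve_command("mostAccurate.txt", "certs/nachokey.pem", "gpfunction"),
    filter_fuse_command("higgs-alfa_9.csv", "higgs-alfa_10.csv",
                        "mostAccurate.txt", 0.47, "gpfunction"),
]
for command in commands:
    print(shlex.join(command))
```

Keeping each command as a list of arguments avoids shell quoting problems. To actually launch one, the list could be passed to `subprocess.run(command, check=True)` from a directory that contains fcube.jar.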