This is statistics 101 review time. How many permutations are we talking about here? To get the figure, multiply the number of possible values for each parameter together: 6 x 6 x 4 = 144 possible parameter sets. Pretty tame, but we're trying to keep it simple for now. We are also using a small number to illustrate a problem we now face. During the backtest, our data took up a little over 3,000 rows, one trade per row, and that was for a single parameter set. Once we expand our survey to include 144 parameter sets, the data file grows to an almost unmanageable length. Not every parameter set will trigger the same number of trades, but at roughly 3,000 trades each (3,000 x 144 = 432,000) we are well on our way to half a million lines of data with just a simple 144-permutation optimization.
This is not going to work for us, so we need to make a compromise. Instead of taking a granular view of every trade, we are going to make some generalizations: how did the trades within a particular parameter set, within a particular market, perform? This gives us one line of data per parameter set, for a total of 144 per market. Multiply that by 47 markets and we are at a manageable 6,768 lines of data.
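For anyone who wants to check the arithmetic in R, here is a minimal sketch. The parameter names `p1`, `p2`, `p3` and their value ranges are placeholders I made up; only the counts (6, 6 and 4), the rough 3,000 trades per set and the 47 markets come from the text above.

```r
# Hypothetical parameter grid: the names and value ranges are placeholders,
# only the number of values per parameter (6, 6 and 4) comes from the text.
grid <- expand.grid(p1 = 1:6, p2 = 1:6, p3 = 1:4)

n_sets         <- nrow(grid)  # one row per parameter set
trades_per_set <- 3000        # rough figure from the single backtest
n_markets      <- 47

n_sets                   # number of parameter sets
n_sets * trades_per_set  # rows if every trade were kept
n_sets * n_markets       # rows with one summary line per set per market
```

`expand.grid` builds every combination for you, so the row count is the product of the individual counts.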
I use TradersStudio to generate the optimization data, and for each market it spits out certain columns of data, including the all-important parameter set column. Instead of trying to combine all 47 markets into a single file, I've opted to read in (the R term for loading data) 47 separate files. The code looks like this:
################### Read Data
################### mark f
AN <- read.csv ("C:/R/BUMBL.WHITE.OPTIX/BUMBL.WHITE.OPTIX.AN.csv", skip=3)
BO <- read.csv ("C:/R/BUMBL.WHITE.OPTIX/BUMBL.WHITE.OPTIX.BO.csv", skip=3)
BN <- read.csv ("C:/R/BUMBL.WHITE.OPTIX/BUMBL.WHITE.OPTIX.BN.csv", skip=3)
.
.
.
ZZ <- read.csv ("C:/R/BUMBL.WHITE.OPTIX/BUMBL.WHITE.OPTIX.ZZ.csv", skip=3)
That's a lot of typing, but they invented this thing called copy and paste, which makes the process a lot easier. May I also recommend looking into a good text editor if you're going to pursue this. I've chosen Vim, and though it has a steep learning curve, it's really cool.
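If even copy and paste gets old, one alternative is to let R build the file names and read them in a loop. This is only a sketch, not the code I actually run: it writes three tiny stand-in files to a temporary folder so it works on any machine, and the `ParamSet` column name and the list name `optix` are made up for the example.

```r
# Sketch only: build a small throwaway folder so the example runs anywhere.
# In practice `dir` would be the real folder, e.g. "C:/R/BUMBL.WHITE.OPTIX".
dir <- file.path(tempdir(), "optix_demo")
dir.create(dir, showWarnings = FALSE)
markets <- c("AN", "BO", "BN")  # stand-in for the full list of 47 symbols

for (m in markets) {
  path <- file.path(dir, paste0("BUMBL.WHITE.OPTIX.", m, ".csv"))
  # three junk lines mimic the rows that skip = 3 discards
  writeLines(c("junk 1", "junk 2", "junk 3",
               "ParamSet,NetProfit", "1,250.5", "2,-100"), path)
}

# One read.csv call per market, collected in a named list
files <- file.path(dir, paste0("BUMBL.WHITE.OPTIX.", markets, ".csv"))
optix <- lapply(files, read.csv, skip = 3)
names(optix) <- markets

optix$AN$NetProfit  # same column access as AN$NetProfit
```

The trade-off is that each market now lives inside a list (`optix$AN`) instead of being its own object (`AN`), which changes how the rest of the script refers to it.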
One view of this data is to see how many of the total 144 parameter sets were actually profitable. Did only one show a profit? Not good. If 121 showed a profit, then now we're talking. Your spreadsheet (data frame in R) is not going to look the same as mine unless you're using TradersStudio, so I offer the next piece of code as a general guideline. I'm extracting a column (vector in R) of data from the whole file and creating an object that I can then manipulate.
################## Define Variables from existing vectors
################## the convention is MARKET.CAPS mark a
AN.NP <- AN$NetProfit
BO.NP <- BO$NetProfit
BN.NP <- BN$NetProfit
.
.
.
ZZ.NP <- ZZ$NetProfit
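If the per-market data frames are kept together in a list, the same extraction can be done in one line. The sketch below uses simulated data frames with made-up values; only the `NetProfit` column name and the 144 rows per market come from this post, and the list names `optix` and `np` are my own placeholders.

```r
# Simulated stand-ins for the per-market data frames (values are made up)
set.seed(1)
optix <- list(
  AN = data.frame(NetProfit = rnorm(144, mean = 500, sd = 2000)),
  BO = data.frame(NetProfit = rnorm(144, mean = 0,   sd = 2000))
)

# Pull the NetProfit column out of every data frame at once
np <- lapply(optix, function(df) df$NetProfit)

length(np$AN)  # one value per parameter set
```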
Now we further manipulate this simple vector of data to derive a percentage of profitability. I'm only going to include one market since by now you get the idea that there are actually 47 markets.
AN.npro <- AN.NP [AN.NP>0] # a subset of rows with values greater than zero
AN.nprolen <- length (AN.npro) # how long is this subset?
AN.len <- length (AN.NP) # how long is the original vector (we know this is 144 in our case)
AN.pp <- AN.nprolen / AN.len # create a new object that represents percentage of profitable sets
pp <- c(AN.pp*100, ... ZZ.pp*100) # a vector of each market's percent profitable sets ("..." stands in for the other markets, it is not R syntax)
hist (pp, ylab="Number of Markets", col="whitesmoke",breaks=4,main="Optimization of White Bumblebee",xlab="What Percentage of All Parameter Sets Were Profitable?") # plot the histogram
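The four-step calculation above can also be collapsed into a single expression, because the mean of a logical vector in R is the share of TRUE values. Here is a sketch on made-up net-profit numbers for three markets; the helper name `pct_profitable` is hypothetical, and the simulated values are not real results.

```r
# Made-up net-profit vectors, one per market, 144 parameter sets each
set.seed(42)
np <- list(
  AN = rnorm(144, mean = 800,  sd = 1500),
  BO = rnorm(144, mean = -300, sd = 1500),
  BN = rnorm(144, mean = 100,  sd = 1500)
)

# mean() of a logical vector is the share of TRUE values,
# so this equals 100 * length(x[x > 0]) / length(x)
pct_profitable <- function(x) 100 * mean(x > 0)
pp <- sapply(np, pct_profitable)

hist(pp, breaks = 4, col = "whitesmoke",
     ylab = "Number of Markets",
     xlab = "What Percentage of All Parameter Sets Were Profitable?",
     main = "Optimization of White Bumblebee")
```

One caveat: if a net-profit vector contains missing values, `mean(x > 0)` returns NA unless you add `na.rm = TRUE`.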
I've also added some text to the graph with the following piece of code:
text (9,7,less20, cex=.4) # x axis location =9, y =7, object name and magnification
text (29,4,less40, cex=.4)
text (49,4,less60, cex=.4)
text (69,4,less80, cex=.4)
text (89,12,less99, cex=.4)
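For readers wondering what a label object such as less20 could hold, here is one way to build that kind of label. This is a guess at the general idea and not my actual code: the percentages are invented, the market symbol CL is a placeholder, and the names `bins` and `labels` are made up for the example.

```r
# Made-up percent-profitable values for a handful of markets
pp <- c(AN = 12, BO = 35, BN = 55, CL = 72, ZZ = 91)

# Assign each market to a 20-point bin, then join the names in each bin
bins   <- cut(pp, breaks = c(0, 20, 40, 60, 80, 100), include.lowest = TRUE)
labels <- tapply(names(pp), bins, paste, collapse = "\n")

hist(pp, breaks = 4, col = "whitesmoke", main = "", xlab = "Percent profitable")
# Place each label near the middle of its bin; the y position is arbitrary here
text(x = c(10, 30, 50, 70, 90), y = 0.5, labels = labels, cex = 0.4)
```

Note that `text` takes coordinates in the plot's own units, so the x values line up with the percentage axis and the y values with the market counts.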
I have not included the code for the object 'less20', but it follows the same variable-defining method. If you really want to see it, let me know and I'll send it to you. But for most of us, we've had more than enough code, so let's see the final product, please.
Here it is:
This is only one view of five I'm planning for Optimization Profile 1.0. Next up is profit expectancy in place of net profit, and then some other metrics. But in the meantime, we have the basic framework for a cool, working graphics engine.

1 comments:
Wow - pretty cool use of R to make sense of all this backtest and optimisation data.
I am personally getting a bit tired of doing it semi-manually in Excel... :(