Wednesday, January 30, 2008

Tick Databases

Ok, so the street has largely embraced KDB as a tick database for high-frequency algo. It is by far the worst DB I have ever worked with in terms of reliability and functionality. The one thing it does have going for it is that it is fast.

Other than that, if you want to deal with the agony of a database that falls over on simple operations, has a largely unintelligible language, Q (a dialect of APL), that no one cares for, and has next to nothing in the way of support from the vendor, be my guest.

KDB is a very (very!) raw database, not much more than a process or two around a bunch of binary files, one for each timeseries per day.

Stay away.

Sunday, January 20, 2008

Evolution of Distribution

For a number of problems, understanding the evolution of the distribution over time is important. The distribution tends to be stable over longer periods and unstable over shorter periods.

The distribution is going to be measured from a sample over some time period. One may want to take a blend of distributions measured over different time periods, combined as basis functions with weights summing to 1.
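As a rough sketch of the blending idea, assume each time window gives us a sample of returns and we measure each distribution as a histogram on a shared set of bins. The function name `blend_histograms`, the window lengths, and the weights below are all illustrative assumptions, not a prescribed method.

```python
import numpy as np

def blend_histograms(samples_by_window, weights, bins):
    """Blend empirical distributions measured over different time windows.

    Each window's sample is turned into per-bin probability mass, then the
    masses are combined with weights that must sum to 1.
    """
    weights = np.asarray(weights, dtype=float)
    if np.any(weights < 0) or not np.isclose(weights.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    blended = np.zeros(len(bins) - 1)
    for w, samples in zip(weights, samples_by_window):
        counts, _ = np.histogram(samples, bins=bins)
        total = counts.sum()
        if total == 0:
            raise ValueError("a window has no samples inside the bins")
        blended += w * counts / total  # probability mass per bin
    return blended

# Illustrative data only: a short, noisier window and a long, stabler one.
rng = np.random.default_rng(42)
bins = np.linspace(-0.1, 0.1, 21)
short_window = rng.normal(0.0, 0.02, size=500)
long_window = rng.normal(0.0, 0.01, size=5000)
blend = blend_histograms([short_window, long_window], [0.3, 0.7], bins)
```

Because each component is normalised before weighting, the blend is itself a valid distribution over the bins.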

The interesting bit is predicting the distribution forward with some statistical accuracy. The order book and momentum indicators should tell us something about how the distribution is going to transform over the next period or based on when a certain price level is achieved.

We are going to use a GA to calibrate the transformation function against historical data. There are many different functions we could use, so we use a GP approach to play with the permutations.
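To make the calibration step concrete, here is a minimal GA sketch. The transformation function is a stand-in (a hypothetical shift-and-scale with two parameters), and the "historical data" is synthetic; the selection scheme, mutation size, and decay rate are arbitrary choices for illustration.

```python
import numpy as np

def transform(params, x):
    # Hypothetical transformation function: shift and scale the input.
    shift, scale = params
    return shift + scale * x

def fitness(params, x, target):
    # Higher is better: negative mean squared error against the target.
    return -np.mean((transform(params, x) - target) ** 2)

def calibrate(x, target, pop_size=40, generations=60, sigma=0.3, seed=0):
    rng = np.random.default_rng(seed)
    pop = rng.normal(0.0, 1.0, size=(pop_size, 2))
    n_elite = pop_size // 4
    n_children = pop_size - n_elite
    for _ in range(generations):
        scores = np.array([fitness(p, x, target) for p in pop])
        elite = pop[np.argsort(scores)[-n_elite:]]  # keep the best quarter
        # Uniform crossover between randomly paired elite parents.
        parents_a = elite[rng.integers(n_elite, size=n_children)]
        parents_b = elite[rng.integers(n_elite, size=n_children)]
        mask = rng.random(parents_a.shape) < 0.5
        children = np.where(mask, parents_a, parents_b)
        # Gaussian mutation, shrinking over time.
        children = children + rng.normal(0.0, sigma, size=children.shape)
        pop = np.vstack([elite, children])
        sigma *= 0.95
    scores = np.array([fitness(p, x, target) for p in pop])
    return pop[np.argmax(scores)]

# Synthetic "history" generated from known parameters (0.2, 1.5).
x = np.linspace(-1.0, 1.0, 50)
target = transform((0.2, 1.5), x)
best = calibrate(x, target)
```

The GA only tunes parameters of a fixed functional form; choosing the form itself is where the GP approach comes in.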

GP for option pricing

As you probably know, GP (Genetic Programming) is an extension of GA that rearranges algebraic or functional instruction trees to fit a solution.
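A toy sketch of what "rearranging instruction trees" means: expressions are nested tuples, and crossover swaps a random subtree of one parent for a random subtree of another. The operator set and helper names here are illustrative assumptions; a real GP system would add fitness evaluation, selection, and depth limits.

```python
import operator
import random

# Assumed operator set for illustration; real systems use richer sets.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(tree, x):
    """Evaluate a tree: 'x' is the variable, numbers are constants,
    and (op, left, right) tuples are binary operations."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def random_tree(rng, depth):
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else rng.uniform(-1.0, 1.0)
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def subtree_paths(tree, path=()):
    """Yield the path (sequence of child indices) to every node."""
    yield path
    if isinstance(tree, tuple):
        yield from subtree_paths(tree[1], path + (1,))
        yield from subtree_paths(tree[2], path + (2,))

def get_subtree(tree, path):
    for index in path:
        tree = tree[index]
    return tree

def replace_subtree(tree, path, new):
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(a, b, rng):
    """Return a copy of `a` with one random subtree replaced by one from `b`."""
    path_a = rng.choice(list(subtree_paths(a)))
    path_b = rng.choice(list(subtree_paths(b)))
    return replace_subtree(a, path_a, get_subtree(b, path_b))

rng = random.Random(7)
parent_a = ("add", ("mul", "x", "x"), 1.0)  # x*x + 1
parent_b = random_tree(rng, 3)
child = crossover(parent_a, parent_b, rng)
```

Since trees are immutable tuples, crossover never modifies the parents, which keeps the population safe to reuse across generations.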


I had not thought of it previously, but one could use such an approach, with the right set of functional constructors, to converge on an option pricing GP. Now, if all we were trying to do was replicate Black / Scholes, CEV, or another gaussian distribution based model, it would not be very interesting.

We know that the actual distributions are often non-gaussian. Could we produce a more accurate approximation of the hedging cost against a non-gaussian distribution (implying the true risk-free price of the option) with GP?

Interestingly, Neural Networks are just special cases of a GP tree, so in the end GP is the most general approach to non-linear regression.