Re: “How I made $500k with machine learning and HFT (high frequency trading)”

A few days back I was browsing my tweets when I ran across this one from @news.yc:

“How I made $500k with machine learning and high frequency trading” (link to actual post here, link to news.yc discussions here)

“Oh, this might be interesting”, I thought to myself. So I clicked the link and started reading..

“This post will detail what I did to make approx. 500k from high frequency trading from 2009 to 2010.” That’s the first line of the post.. looks promising, doesn’t it?

But sadly, there wasn’t anything interesting. Besides briefly mentioning what was traded, there are no details regarding what was done to make any claimed profits. That’s not really describing in detail what you did to make approx. 500k from HFT, is it? There’s a lot of talk about creating software for running back tests. Sure, that’s an important part of trying out various strategies. It’s also easy to screw up, and of course would never be 100% accurate as we simply can’t predict everything. So good for you, and I know the amount of work involved with this as I’ve done that myself.

Still, it’s hard for me or anyone else to say anything about the quality of this strategy, and it doesn’t surprise me that there’s been no response from the 6 HFT funds he emailed, unless the email contained a lot of actual performance measurements and other related figures of interest. And if there actually is any value in the strategy, I’m not sure I would give it away for free in a blog post, even if I had no interest in pursuing further usage.

Problem is, what was promised wasn’t delivered. It would have been one thing if he did as proclaimed and actually talked about his indicators/strategy. At least then there would be something to potentially think about. If that’s not something he’s willing to do, then at least give us some form of performance measurement. Plotting just one side of the story isn’t enough, and I’m not willing to try and figure that out with the highly limited set of data available. At least include a related benchmark of some sort. Even better, do some form of alpha/beta estimation. That would be a nice start.


  1. Hi, I’m the author. I stopped running my program in 2010 so I had no reason not to post this.

    I have a feeling that what you are familiar with is different from what I was doing. My “strategy” was simply to get prices where I had positive statistical odds of making money. So I think it’s fair to say I was making a market more than running a strategy. You mentioned that a backtesting system “would never be 100% accurate as we simply can’t predict everything”. I’m going to venture a guess that I WAS able to predict most of what you’re thinking I couldn’t. This was key to everything I did.

    It’s true I had indicators and I did not post them but I do not believe the indicators are the reason I was successful. There were a whole bunch of indicators that all gave only small predictive ability. Anyone could probably have come up with them. I’ll give you an example – if the NASDAQ just moved up – this predicted the Russell would also move up. Simple arbitrage right? Sure it’s a lot more sophisticated than that but I didn’t want to start writing formulas in my article.

    So in conclusion, if you’ll look at the article from a slightly different perspective, maybe you’ll find it interesting. I did try to post everything I thought was interesting about how I set up my system.

    I’m not that familiar with alphas/betas so not sure what to say about that. But I can say that I started off with only $10,000. I can also say I had a stop loss of $3000/day and never hit it. It was very non-risky.

    1. Regarding the backtesting and why I’m saying it can’t account for everything: By adding or removing liquidity (which you will be when simulating trades on historical data) it’s difficult and/or impossible to fully account for how that will affect the market. So merely by participating you’re changing things, which might dictate how other participants act, and so on. It might not be a real issue if you’re a small fish in a big ocean, who knows, right?

      When it comes to alpha/beta, it’s not that complex in its simplest form. You are basically trying to solve this equation:

      “returns on your strategy/portfolio” = alpha + beta * “returns on benchmark”

      Let’s say we’re only looking at one company, XYZ. Over four periods (time scale might be day, week, month, doesn’t really matter to the example), we’ve seen these returns:

      XYZ, period 1: +5%
      XYZ, period 2: +3%
      XYZ, period 3: -3%
      XYZ, period 4: +2%

      If you buy and hold one stock during all of those four periods, your alpha = 0, beta = 1, as we can easily see from the equation. If you buy and hold 10 instead of one, your alpha is still zero, and your beta is now 10. Beta isn’t really that interesting: it just tells us how exposed/leveraged you are against the market, which in this example is company XYZ.
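The buy-and-hold claim above can be checked with a few lines of Python. This is only a minimal sketch: `ols_alpha_beta` is a made-up helper name, and the "10 stocks" case assumes returns are measured against the capital for a single share, so that holding 10 simply means 10x exposure.

```python
def ols_alpha_beta(benchmark, strategy):
    """Fit strategy = alpha + beta * benchmark by ordinary least squares."""
    n = len(benchmark)
    mean_x = sum(benchmark) / n
    mean_y = sum(strategy) / n
    # Sum of squared deviations of the benchmark, and the cross-deviations.
    sxx = sum((x - mean_x) ** 2 for x in benchmark)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(benchmark, strategy))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


xyz = [5.0, 3.0, -3.0, 2.0]        # XYZ returns in percent, periods 1 to 4
one_unit = list(xyz)               # buy and hold one share
ten_units = [10 * r for r in xyz]  # assumption: 10x exposure on the same base

alpha_1, beta_1 = ols_alpha_beta(xyz, one_unit)
alpha_10, beta_10 = ols_alpha_beta(xyz, ten_units)

print("1 share:   alpha = %.2f, beta = %.2f" % (alpha_1, beta_1))
print("10 shares: alpha = %.2f, beta = %.2f" % (alpha_10, beta_10))
```

In both cases the intercept comes out at zero and only the slope changes, which is the point: scaling your exposure moves beta, not alpha.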

      What is interesting however is if you could produce a positive alpha value. That’s typically not as easy. Say you had some insight telling you to close out on XYZ during all of period 3. This insight might be inside information, but let’s hope it’s not and you instead have a good strategy that was able to pick up on something which alerted you prior to period 3.

      Your returns would then be, assuming we only hold 1 stock:

      Strategy, period 1: +5%
      Strategy, period 2: +3%
      Strategy, period 3: 0%
      Strategy, period 4: +2%

      We would normally then solve for alpha and beta using linear regression. Several applications can help us here, with probably the most common being MATLAB or R. I also think Excel can solve this for you, but I haven’t looked into that. OxMetrics is also quite good for these things.

      So running this in MATLAB:

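      y = [5; 3; 0; 2];             % strategy returns per period, in percent
      x = [1 5; 1 3; 1 -3; 1 2];    % column of ones (intercept) and XYZ returns

      b = regress(y,x)              % b(1) is alpha, b(2) is beta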

      b now contains alpha and beta, with alpha = 1.47 and beta = 0.59. Of course there are several things to remember here: We only have four periods, but would need a lot more for this to be of any use. Also, we should look at the significance of our alpha.

      Then again, people might say this equation is too simplistic to fully capture the risk picture as it’s purely linear. But we’ll leave that to the academics for now..
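For anyone without MATLAB, the same regression and the point about significance can be sketched in plain Python. This is an illustration under the usual textbook OLS assumptions, not a proper statistical treatment; `ols_with_alpha_tstat` is a made-up helper name, and the t-statistic uses the standard formula for the intercept's standard error.

```python
import math


def ols_with_alpha_tstat(benchmark, strategy):
    """Fit strategy = alpha + beta * benchmark and return (alpha, beta, t-stat of alpha)."""
    n = len(benchmark)
    mean_x = sum(benchmark) / n
    mean_y = sum(strategy) / n
    sxx = sum((x - mean_x) ** 2 for x in benchmark)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(benchmark, strategy))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    residuals = [y - (alpha + beta * x) for x, y in zip(benchmark, strategy)]
    # Residual variance with n - 2 degrees of freedom (two fitted parameters).
    s2 = sum(e * e for e in residuals) / (n - 2)
    # Textbook standard error of the intercept in simple linear regression.
    se_alpha = math.sqrt(s2 * (1.0 / n + mean_x ** 2 / sxx))
    return alpha, beta, alpha / se_alpha


xyz = [5.0, 3.0, -3.0, 2.0]       # benchmark: XYZ returns in percent
strategy = [5.0, 3.0, 0.0, 2.0]   # strategy: flat during period 3

alpha, beta, t_alpha = ols_with_alpha_tstat(xyz, strategy)
print("alpha = %.2f, beta = %.2f, t(alpha) = %.2f" % (alpha, beta, t_alpha))
```

The alpha and beta match the 1.47 and 0.59 from the MATLAB run. The t-statistic on alpha lands somewhere around 3.7, which with only two degrees of freedom is still below the roughly 4.30 needed for the usual 5% two-sided level. So even this hand-picked example doesn't give a statistically convincing alpha with four data points.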

  2. I just stumbled upon this post. I actually have published details of a working HFT strategy at my blog,
    I hope to talk about a few more as well.
