Ask HN: Any high frequency trading hackers
114 points by astroguy on Nov 28, 2010 | 58 comments
What are the current challenges in high frequency trading?

Can anyone suggest a few tasks [like implement foo algorithm] that I can jump straight into during my free time?




A lot of the problems with HFT (high-frequency trading) basically boil down to proximity to the exchange. If you haven't paid for a colo inside their datacenters, you can't well expect to use the usual HFT tricks of putting in requests for things you don't want and then never acting on them, because with your 40ms latency to the exchange the guys in the colo will already have acted on your move. I heard someone say that they got a colo for 10,000 a month (sorry, I don't have a source for that), so that kind of edges you out of really good HFT. Another thing to know is that you are running around like a chicken with its head cut off trying to grab pennies off a railroad track that is running bullet trains (I love that analogy), i.e. very dangerous. Slight mistakes can cost you hundreds of dollars every 20ms until you hit Ctrl+C in your script! I do wish you the best of luck, and I hope you will write about your progress/experience in HFT on HN in the future.


Thanks for alerting me! I am a newbie to this field, but love to explore the pros and cons of the current algorithms used in HFT. Sure! I will.


HFT is, as the name suggests, all about speed: sub-millisecond latency in some markets. It requires a lot of physical and expensive resources, not to mention an extremely deep knowledge of the mechanics of whatever contracts you wish to trade and the exchanges they are traded on. Basically there's a reason this field is played only by the big banks and hedge funds. (A bank I know spent at least $10m setting up an HFT desk.)

Algo trading is probably a much better option: basically trading off the back of quant/stat analysis you have done with respect to prices (or relative prices). You'll learn lots about whatever contracts/instruments/markets you're interested in, plus get to flex your geeky skills. And you can do it from a laptop at home over a regular internet connection with some cheap, if not free, trading platform or API.


Is it actually still possible (by that I mean realistic) to make a profit that way? You'd think the banks have seized all algorithmic trading opportunities; if you try something like that at home, you'll always be playing second fiddle.


I've been working on an algorithmic trading system using machine learning; it is not HFT currently. It is currently daily (equities held 24h+); the intra-day side (equities held ~5-60 minutes) will come very quickly once I feel comfortable with the machine learning side of things. The source of data will change, and with a few tweaks to the actual trading system it will be running intra-day. The HFT will only come around once I can get a small colo that can achieve the necessary <40ms transactions to get the benefit of the pre-window before orders actually hit the open market.

http://edwardworthington.com/

Sorry, the interface was thrown together over a weekend (the actual back-end application was the primary focus for the last year, as it was just me looking at it via the command line). I quickly designed it with a large AJAX load at the beginning; I'll eventually change it to a static load and then do AJAX polling to update the data.

I cannot recommend any particular reading sources, as I've been working with my financial buddies, who have been feeding me tips, and doing my own discovery on the internet.

This was just a side project of mine but has turned into a really nice application. It is always calculating the better strategies (out of over 50 possible different methods/functions with variables ranging from 0-260 that are used to indicate open and close signals in any number of combinations). It has improved its strategy over the last week taking it from estimating ~60% to ~70% gains YTD. I have no doubt it will eventually get over 100%.

I'd love to collaborate with anyone wanting to get into this stuff as I'm flying solo.


Where do you get the data? I've found access to (inexpensive) sources of data to be a problem. For backtesting, I'd love to get historical data; even a sample would do. A long time ago, Island used to make their data available. Then they were bought out by NASDAQ, and no more data :-(


I've been collecting data from various places. Originally, while I was building the framework, I simply downloaded Yahoo daily data to test against.

Now I've been downloading and recording tick changes from my broker, Optionshouse.com.


It looks very interesting! If you need/want help with interface design and front-end of the application, shoot me an e-mail (my username at gmail).


Awesome! I am more than interested in getting involved. Could you please shoot me an email at aliengeek4u at gmail dot com?


I will shoot you an email in the morning.


I used to trade Forex (had a small startup). I'm looking to get back into the game. Currently working on a few algorithms that I'd like to test out, soon. I'd love to bounce some ideas around with you. Email is in my profile!


I've been tinkering with Forex at eToro. I can see A LOT of potential there, and I would like to chat about it with you as well. I will contact you shortly.


Waiting for your email!


I apologize, I've been in the middle of a move.

I posted my contact info @ HNofficeHours : http://hnofficehours.com/profile/a904guy/


can you go more into the machine learning parts? i love such uses of ML, so futuristic.. im guessing you use either some kind of genetic algorithm or neural networks as is most common? or something else entirely?


Absolutely

So the system is built around the ~50 different indicators and oscillators that I've found the formulas for generating.

These "methods" are variable driven, meaning each method could have 1 to 4 variables ranging from 0.0001 to 260.

The current testing platform, which manages itself, has the end goal of finding the highest gain after commissions using any combination of these methods for opens and closes.
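
To make "variable driven methods" concrete, here is a minimal sketch of one such method: a simple moving average crossover with two window variables. The function names and the crossover rule are illustrative assumptions, not the actual indicators or formulas used in the system described here.

```python
from typing import List, Optional


def sma(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, p in enumerate(prices):
        running += p
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices: List[float], fast: int, slow: int) -> List[int]:
    """Hypothetical 'method' driven by two variables (fast, slow).

    Returns +1 on the bar where the fast SMA crosses above the slow SMA
    (open signal), -1 where it crosses back below (close signal), else 0.
    """
    f, s = sma(prices, fast), sma(prices, slow)
    signals = [0] * len(prices)
    for i in range(1, len(prices)):
        if None in (f[i - 1], s[i - 1], f[i], s[i]):
            continue  # not enough history yet
        was_above = f[i - 1] > s[i - 1]
        is_above = f[i] > s[i]
        if is_above and not was_above:
            signals[i] = 1
        elif was_above and not is_above:
            signals[i] = -1
    return signals
```

Each distinct (fast, slow) pair is one candidate; combining several such methods for opens and closes is what makes the search space explode.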

Considering the __massive__ number of possible combinations of these methods, the system runs two testing suites. The first round is a simple test suite that only sums the gain it would get using those methods over the last YTD. The highest sum here won't necessarily be the best combination in the end, as commissions will bite you if you don't keep an eye on them. We only collect the sum of gains because of the MASSIVE number of tests we have to run. This cuts down on days/weeks/months of computation time.

So it will increment through the blocks (two sets of combinations, one for open, one for close) of functions, typically by 1000: it will start testing at #1, then skip 1000 of them, test #1001, rinse and repeat until it has run through the whole list (this is broken out across 5 machines around my condo, and multi-threaded to utilize every core minus 1 of each machine; I have roughly 40 cores working on this system).

Once it has identified the best ~25% of raw gain increment points, it will then start incrementing forward and backward around those starting points as long as the raw gain is increasing and not showing a change in direction (I use a mixture of a momentum formula and P SAR to determine the change in direction, as you will get some noise and a quick change could make you miss gold on the other side because of a slight fluctuation). So while all these are being run, the results get funneled back into the job queue for the second half of the testing suite.
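
A rough sketch of that coarse-then-refine search, under stated assumptions: `raw_gain` is a hypothetical stand-in for the cheap first-round test of combination number i, and the walk stops at the first non-improvement. That stopping rule is a deliberate simplification; it does not reproduce the momentum plus P SAR noise filter mentioned above.

```python
from typing import Callable, List, Tuple


def coarse_then_refine(
    n_combos: int,
    raw_gain: Callable[[int], float],
    stride: int = 1000,
    keep_fraction: float = 0.25,
) -> List[Tuple[int, float]]:
    """Sample every `stride`-th combination, then hill-climb around the best.

    Returns (index, gain) pairs for the local peaks found, best first.
    """
    # Round one: test combination 0, stride, 2*stride, ...
    coarse = [(i, raw_gain(i)) for i in range(0, n_combos, stride)]
    coarse.sort(key=lambda pair: pair[1], reverse=True)
    seeds = coarse[: max(1, int(len(coarse) * keep_fraction))]

    results = {}
    for start, start_gain in seeds:
        best_i, best_gain = start, start_gain
        for step in (1, -1):  # walk forward, then backward, from the seed
            i, gain = start, start_gain
            while 0 <= i + step < n_combos:
                nxt = raw_gain(i + step)
                if nxt <= gain:  # simplification: stop at first non-improvement
                    break
                i, gain = i + step, nxt
            if gain > best_gain:
                best_i, best_gain = i, gain
        results[best_i] = best_gain  # seeds that reach the same peak collapse
    return sorted(results.items(), key=lambda pair: pair[1], reverse=True)
```

The peaks returned here would be the candidates handed to the more expensive second-round test.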

This is also distributed across all the machines.

The second half then runs the full test suite on each combination of methods to recreate the market verbatim for the last YTD and determine everything: gains, share quantity, p/l, commissions spent, highest equity, etc.
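
For illustration, a minimal long-only replay that tracks the quantities listed above. The flat per-trade commission, the all-in position sizing, and every name here are assumptions made for the sketch, not details of the actual system.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BacktestResult:
    gross_pl: float
    commissions: float
    net_pl: float
    trades: int
    highest_equity: float


def backtest(
    prices: List[float],
    signals: List[int],
    starting_cash: float = 10_000.0,
    commission_per_trade: float = 5.0,  # assumed flat fee per order
) -> BacktestResult:
    """Replay prices bar by bar: +1 opens a long position, -1 closes it."""
    cash, shares, entry_price = starting_cash, 0, 0.0
    gross = fees = 0.0
    trades = 0
    highest = starting_cash
    for price, signal in zip(prices, signals):
        if signal == 1 and shares == 0:
            # Buy as many whole shares as cash allows after the fee.
            shares = int((cash - commission_per_trade) // price)
            if shares > 0:
                cash -= shares * price + commission_per_trade
                fees += commission_per_trade
                entry_price = price
                trades += 1
        elif signal == -1 and shares > 0:
            cash += shares * price - commission_per_trade
            gross += shares * (price - entry_price)
            fees += commission_per_trade
            trades += 1
            shares = 0
        highest = max(highest, cash + shares * price)  # mark to market
    return BacktestResult(gross, fees, gross - fees, trades, highest)
```

Sorting candidates by `net_pl` rather than `gross_pl` is what separates this round from the cheaper first pass.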

Finally all this data is put back into the job queue once again to be sorted to find the highest net gains after commissions.

The winning methods are finally stored in the datastore to be accessed by the actual stock trading platform that will use this data during trading hours to execute trades accordingly.

So from start to finish the application decides which tests it wants to run, determines the best strategy to use, and executes the strategies on its own.

I tried to keep it very basic explanation, there is a lot of other things that go on, to make this all work flawlessly. I do want to thank 0MQ and Gearman for playing vital roles in work distribution and message queuing amongst the worker threads.

I highly recommend that anyone who ever wants to truly learn how to scale an application build an algo trading system on limited hardware and try to squeeze EVERY millisecond out of it. (I've rebuilt it from the ground up numerous times.)

Actually typing this out makes me see some similarities to a map and reduce method in some ways as well.


Awesome. It sounds like you are running some kind of stochastic optimization algorithm on the data points and also doing so across a cluster! Are you eventually gonna leverage GPUs on the machines as well? Also, do you have a blog or something?

>I tried to keep it very basic explanation, there is a lot of other things that go on, to make this all work flawlessly.

nope, was more than enough. Don't want you to give away your secret sauce :D. Is the system in production yet?


I have been tinkering with PyCUDA and OpenCL. I will have to convert all my algorithms to kernel code for CUDA, but it shows VERY promising results. I can see the application being run across GPUs to keep costs low. I have a handful of nice nVidia cards with around 200 cores each.

I tried to write up a little about Edward and his backend process a while back on my personal website, but I find that it changes too much currently to keep anything on paper. I code more than I can write about the code ;]

The system is being run against other virtual systems now. Everything has been trending almost exactly along my estimates, so I'm very excited to get it started. The main testing ground is my Optionshouse.com account; their interface is nice. All interactions are XML requests, so I've been able to easily write an API that is VERY solid to do everything I need.

I'm moving the system this week so that I can get it into a more highly available setup. After that I will begin running it against my personal IRA account to see how it goes on the long side at least.


Also interested. I have wanted to try GAs with this same general approach for awhile but I have a trading background not a coding one. logicwins atthe gmail dot com.


I'll shoot you an email today.


Hey I'm interested in this and have tried to code some stuff in C# for options trading and using candlestick patterns, support/resistance, etc. I'm now sort of looking into options pinning on expiration and exploiting that - just need to find a good cheap source for options data. Anyhow can you send me an email - maybe we can help each other out? carbtrader @ gmail dot com


Hadn't started implementing any systems yet, but earlier this year I ordered these and they've been ultra-educational at least for figuring out how to get started, vocabulary, etc.:

High-Frequency Trading: A Practical Guide to Algorithmic Strategies and Trading Systems http://www.amazon.com/gp/product/0470563761

Inside the Black Box: The Simple Truth About Quantitative Trading http://www.amazon.com/gp/product/0470432063

Quantitative Trading: How to Build Your Own Algorithmic Trading Business http://www.amazon.com/gp/product/0470284889


For what it's worth....

I really liked Inside the Black Box.

I didn't learn much from, or much enjoy, High-Frequency Trading, though.



You won't find a better place to ask than: http://www.wilmott.com/index.cfm?&forumid=1


There is a fledgling Hacker-News-for-quants at http://quant.ly.

The site is still building critical mass, but most of the current users are experienced quants & HF traders. (Disclaimer: I launched the site)


I nearly launched quantly.com a year ago. Almost bought the .ly but held off on it. Surprising to see someone using the domain here.


Great site. OP might find http://collective2.com interesting as well. Lots of medium-frequency algorithms that might be more accessible if you can't pay big bucks to colo.


Thanks for sharing. Have you subscribed to any strategies?


No, I was playing around with creating some basic strategies a while back using MetaTrader4 (http://www.metatrader4.com/) to see how it works when I discovered that site. Never got around to trying other people's.


I'd suggest maybe looking at algo trading rather than HF. It's much more accessible to outsiders, plus you can use the algorithms on places like Betfair and stand a decent chance of actually making money.


Do you have any suggestions on where to start reading on Algorithmic Trading? Perhaps also a way that I could test algorithms against the market, without actually risking any money?


The Encyclopedia of Trading Strategies by Jeffrey Owen Katz and Donna McCormick is good. My own experience suggests it is extremely difficult to find exploitable inefficiencies by looking at market data alone [I have not managed to do so and have spent a lot of time attempting to do so] and you're more likely to succeed making something useful people want!


You can buy historical 'tick data'. Be forewarned, however, that simulations run on this kind of data are not the same as real trading since it doesn't reflect the bid/ask. Also in my opinion and experience, price alone is insufficient data for analysis. Also don't forget to figure in total execution cost as it makes a huge difference in the evaluation of algorithms, not to mention the 'bank roll' necessary to allow any 'edge' to play out. Trading simulations are an engaging software problem, but they aren't such a great approximation of actual trading, at least in my experience.
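
A back-of-the-envelope sketch of why execution cost matters so much. It assumes the simulation priced fills at the last trade (or mid) price, so a real market order gives up roughly half the bid/ask spread on each leg; the function names and the cost model are illustrative assumptions only.

```python
def net_edge_per_share(
    gross_edge: float,
    spread: float,
    commission_per_share: float,
) -> float:
    """Expected profit per share for one round trip after costs.

    Half the spread is paid on entry and half on exit, plus commission
    on both legs (an assumed, simplified cost model).
    """
    spread_cost = 2 * (spread / 2)
    commission_cost = 2 * commission_per_share
    return gross_edge - spread_cost - commission_cost


def trades_to_cover_drawdown(drawdown: float, net_edge: float, shares: int) -> float:
    """Average round trips needed to earn back a drawdown; inf if there is no edge."""
    if net_edge <= 0:
        return float("inf")
    return drawdown / (net_edge * shares)
```

For example, a simulated 3-cent edge per share with a 2-cent spread and half a cent of commission per share per leg nets out to roughly zero, which is the kind of gap between simulation and real trading described above. The second function hints at the bankroll point: the thinner the net edge, the more trades (and capital) it takes to ride out a losing streak.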


You can also download daily tick data from Yahoo Finance for free. They offer bid/ask, volume, and adjusted closes.


"Tick data" refers to trade by trade execution data. Daily data is end of day summary data. If you are doing HFT, you need tick data and the market depth. Huge volumes of data compared to daily close data.


Aye, tick data is needed for HFT, no doubt there. I was actually responding to the part of this thread suggesting they try algorithmic trading first. You can build a successful system around daily closes, as long as you plan to hold your equity for over a 24hr period.
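
To make the tick vs. daily distinction in this subthread concrete, here is a small sketch that collapses trade-by-trade ticks into end-of-day summary rows. The `(timestamp, price, size)` tuple layout is an assumption; real feeds and vendor files each have their own formats.

```python
from datetime import datetime
from typing import Dict, Iterable, Tuple

Tick = Tuple[datetime, float, int]  # (timestamp, trade price, trade size)


def ticks_to_daily_bars(ticks: Iterable[Tick]) -> Dict[str, dict]:
    """Collapse ticks into one open/high/low/close/volume row per day.

    Assumes ticks arrive in time order.
    """
    bars: Dict[str, dict] = {}
    for ts, price, size in ticks:
        day = ts.date().isoformat()
        bar = bars.get(day)
        if bar is None:
            # First trade of the day sets every field.
            bars[day] = {"open": price, "high": price, "low": price,
                         "close": price, "volume": size}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen so far
            bar["volume"] += size
    return bars
```

The conversion only goes one way: thousands of ticks become five numbers per day, and the intraday detail cannot be recovered from the daily rows.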


For serious HFT development you need to have order book data too. This costs.
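
As a rough idea of what "order book data" looks like once loaded, here is a toy price-level book. The `(side, price, size)` update format is hypothetical; actual exchange feeds define their own message types.

```python
from typing import Dict, Optional, Tuple


class DepthBook:
    """Aggregated price-level book built from (side, price, size) updates."""

    def __init__(self) -> None:
        self.bids: Dict[float, int] = {}
        self.asks: Dict[float, int] = {}

    def update(self, side: str, price: float, size: int) -> None:
        """Set the resting size at a price level; size 0 removes the level."""
        levels = self.bids if side == "bid" else self.asks
        if size == 0:
            levels.pop(price, None)
        else:
            levels[price] = size

    def best_bid(self) -> Optional[Tuple[float, int]]:
        """Highest bid as (price, size), or None if there are no bids."""
        return max(self.bids.items()) if self.bids else None

    def best_ask(self) -> Optional[Tuple[float, int]]:
        """Lowest ask as (price, size), or None if there are no asks."""
        return min(self.asks.items()) if self.asks else None

    def spread(self) -> Optional[float]:
        bid, ask = self.best_bid(), self.best_ask()
        return None if bid is None or ask is None else ask[0] - bid[0]
```

Every level change is a separate message, which is a big part of why depth data is so much larger (and pricier) than trade ticks alone.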


You can easily test in virtual markets.

Optionshouse has a nice one that has a XML ajax interface.

VSE from marketwatch is a bit older but can also be done.

I've written APIs for both.


what parts of betfair are you suggesting? I am skeptical but intrigued. while profitable gambling is possible, I expect the potential returns are much lower for comparably successful strategies, with advantaged gambling also less amenable to extracting any sort of substantively edge-giving pattern.

edit: i'm referring to algorithmic trading v gambling, not HFT.


Betfair gives you access to the order book and you can get historical data relatively cheaply. Pretty much any liquid market on Betfair will do (in-play sports events tend to be good). Betfair markets do tend to show a lot of the same patterns as "real markets", although obviously there are differences. But it's a decent place to learn the fundamentals.

You're right that potential returns are much higher on real markets, purely because there's far more money traded on real markets. On the other hand, entry to "real markets" is much more expensive; Betfair is fairly cheap to enter and doesn't charge a per-trade commission. Plus there are more market inefficiencies on Betfair that don't exist in more developed markets.


If it wouldn't trouble you to do so, could you go into more detail please? Do you have experience with this?

You are saying that any sporting event, in particular those that are in play, actually has some statistically mineable property with profitably exploitable strategies? If this is the case, with such a low barrier, should this not be a game with zero expectation in the long run (maybe negative if there are minimum bet rules)? Whatever distribution can be learned would have much higher variance than your typical intra-day trades, no? And while these techniques may be similar in spirit, do the fundamentals map sufficiently? As they say, the devil is in the details. But I speak from ignorance.


I've messed around a little bit with it, but I know some of my quant friends do it more seriously. There are definitely ways of making money off Betfair; you even occasionally have arbitrage opportunities across the different markets on Betfair.

I think the fundamentals are close enough that it's a useful place to learn. You're not going to be able to take a method that works on Betfair and apply it to a real market, but a lot of the statistical mining/prediction techniques you can use on Betfair are exactly the same as those used in real markets.


Betfair isn't just for gambling. You can trade betting positions. In that sense, it's a marketplace amenable to algorithmic trading like any other, entirely without relying on having a gambling advantage.
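
A small worked sketch of trading a betting position: back a selection at one price, then lay it at a shorter price so the outcome no longer matters. It uses decimal odds and ignores any exchange commission or fees, which is a simplifying assumption; the function names are made up for the example.

```python
def lay_stake_for_equal_profit(back_stake: float, back_odds: float, lay_odds: float) -> float:
    """Lay stake that gives the same result whether or not the selection wins."""
    return back_stake * back_odds / lay_odds


def locked_in_profit(back_stake: float, back_odds: float, lay_odds: float) -> float:
    """Profit from backing at `back_odds`, then laying at `lay_odds` (decimal odds).

    Ignores commission (assumption). Negative when the lay price is worse
    than the back price, i.e. the trade went against you.
    """
    lay_stake = lay_stake_for_equal_profit(back_stake, back_odds, lay_odds)
    if_wins = back_stake * (back_odds - 1) - lay_stake * (lay_odds - 1)
    if_loses = lay_stake - back_stake
    # The two outcomes are equal by construction; min() guards against rounding.
    return min(if_wins, if_loses)
```

Backing 100 at 2.2 and laying 110 at 2.0 leaves about 10 either way, which is the betting-market analogue of buying and then selling at a higher price.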


In the big picture view it seems that HFT is becoming a crowded trade. I would think the competition would be a huge challenge. Maybe you should try to start off in foreign markets where there is some breathing room - if that's possible.

Here is an article about Wall Street programmers leaving the big boys to go it alone.

http://www.forbes.com/2010/07/28/high-frequency-trading-pers...


Challenges:

* The need for speed, at every level of the architecture (network, tcp/ip, hardware, app)

* Reducing order (send/ack) round-trip times. This generally means putting your servers in a data centre as close to the exchange as possible (co-located, if the exchange offers it). If you're trading across multiple exchanges simultaneously it gets trickier.

* Sourcing market data - can you source direct from the exchange, rather than through a 3rd party like Reuters? Again, it comes down to how fast you can react to the market.

* Back-testing - you need historical data to test a model, then you need a way of testing the model - what are you going to test against? How are you going to simulate the exchange?

* Expense - it's expensive - market data, co-location etc. all cost money, as others suggested. HFT is generally short-term positions, with some arbitrage strategies holding positions for less than a few milliseconds. A medium-term (intra-day) type strategy requires less intensive (expensive) technology, as you're not trading to capture market prices that might only be extant for a few milliseconds.
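
To put a number on the round-trip point in the list above, here is a quick estimate of the physical floor on latency over fiber. It assumes light in glass travels at roughly two thirds of its vacuum speed and a perfectly straight cable; switches, serialization, and routing only add to this.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # in vacuum
FIBER_FACTOR = 2 / 3               # rough approximation for light in glass


def fiber_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time over a straight fiber run, in milliseconds."""
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000
```

That works out to about 10 ms of round trip per 1,000 km before any equipment is involved, versus microseconds for a cable run inside the same data centre, which is why co-location dominates the list.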


Light speed seems to be a pretty important problem - the added benefit being if you crack that nut, further employment will be unnecessary.


Yeah, but causal loops would be a bitch.


Oh, well, what's one more type of pollution? I think a world with causal pollution would be a more interesting one.


Ah, quantum teleportation is the promising one to overcome current propagation delay.


what do you mean? transmission of actual information from quantum entanglement based results are still bounded by classical channels.


What value does HFT create?


HFT provides liquidity which reduces everyone else's cost and/or exposure to short term risk.

Low-latency trading, though, costs everyone. Nobody actually needs anything faster than a fill in the blink of an eye. The only people who think they do really need smarter match engines, or need to stop taking advantage of people.


Where does short-term risk come from? And what's the difference between HFT and LLT?

If anyone knows of good high-level reading material on the subject, I'd appreciate a link so I can spare you more dumb questions.

I admit to having drunk some of Mark Cuban's Kool-Aid, as well as Jon Stewart's and that of some other liberal sources. Their argument, as far as I understand, is that HFT doesn't provide value proportionate with that which it extracts from a system designed to connect investors with entrepreneurs.

The idea has an appealing simplicity, especially given recent history, but I don't know very much about markets, so I thought I'd ask people on the other side of the debate for their take.


That's a very important question to ask and I myself would like to get a response from some of the readers who're hacking away at it, instead of having you get downvoted.

How does one prevent a catastrophe that may result from "rogue algorithms" that are far worse than Infinium's? And is such risk truly worth it?


Maxeler Technologies (http://www.maxeler.com) supplies turnkey FPGA-based acceleration that supports high-speed trading with trading latencies on the order of a few microseconds. Software development for the accelerated platform begins with a client's "known to work" proprietary code, which Maxeler accelerates. When latency is an important performance factor, the Maxeler trading server needs to be co-located in the exchange.


Get in touch with savvis to do a colo at NJ2.


One of the best books I've read about electronic trading is Trading and Exchanges: Market Microstructure for Practitioners.



