Online Learning Challenge

The Online Learning Challenge is about finding the best way to mix a number of signals into a strategy that performs well on unseen data. Because it is purely a machine learning and statistics problem, it does not require any experience in finance.

For example, suppose three people advise you on how to invest your money. Each signal is the series of daily returns you would have earned by following the advice of one person only. You want to combine these signals in a way that maximizes your overall return. Remember that you are building an online learning system, so each day's decision can use only the data seen so far.

Please email solutions to data-science-challenge (AT) tworoads (dot) co (dot) in with code used to find the solution.

 

Objective:
You are provided with a file containing dates and the log returns of a number of signals on those dates. The signals will be combined linearly to form a strategy, i.e. your strategy = (weight_1) * (signal_1) + (weight_2) * (signal_2) + … + (weight_n) * (signal_n). Your objective is to find an optimal combining method (one that maximizes net log returns) that uses only past data. The weights you give on day i should be based only on data up to day i-1, and the log return of the strategy on day i is log ( Sum over j ( w(i,j) * exp ( logret(i,j) ) ) ). You have to write C++ / Python / Cython / Julia code that tries to maximize net log returns on a test dataset. Please ensure that the code is general, in the sense that it can easily run for any number of signals and days in the test dataset.
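
The per-day formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not part of the provided eval.py; the function name strategy_log_return and the sample numbers are assumptions made for the example.

```python
import math


def strategy_log_return(weights, log_returns):
    """Log return of the combined strategy for a single day.

    weights     -- w(i, j) for each signal j (chosen from data up to day i-1)
    log_returns -- logret(i, j) for each signal j on day i
    """
    if len(weights) != len(log_returns):
        raise ValueError("need exactly one weight per signal")
    # Turn each log return into a gross return, mix linearly, take the log.
    gross = sum(w * math.exp(r) for w, r in zip(weights, log_returns))
    return math.log(gross)


# Made-up log returns for three signals on one day.
day_logrets = [0.01, -0.02, 0.005]
print(strategy_log_return([1.0, 0.0, 0.0], day_logrets))
print(strategy_log_return([1 / 3, 1 / 3, 1 / 3], day_logrets))
```

Putting all the weight on one signal reproduces that signal's log return, and any other valid mix lands between the smallest and largest signal log return of the day.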

 
Submission:
You are expected to implement the eval function in the given script. Submit your modified version of eval.py along with any other files that we might need to run it. If you are using a language other than Python, feel free to translate the eval.py script; in that case, also provide the commands to build and run your submission. The command 'python eval.py path_to_input_file' should output the results.

 


Scoring:
The solution will be evaluated on a test set that is different from the training set. On the test data, we will run a strategy that starts with a certain amount of money and, each day, allocates to each signal the fraction of the portfolio given by the weights you assign. We assume daily rebalancing to the specified weights. Scoring will be based purely on the net log returns on the test data; however, there will be extra points for risk-adjusted return. Brownie points for efficient run time.
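
As a rough picture of this evaluation, the sketch below walks forward through the days, asks a weight function for that day's weights using only earlier days, and accumulates the log return. It is a sketch under stated assumptions, not the organizers' scoring script: backtest, uniform_weights, and start_capital are hypothetical names, and the equal-weight rule is only a placeholder baseline.

```python
import math


def uniform_weights(history, n_signals):
    """Hypothetical baseline: equal weight on every signal, ignoring history."""
    return [1.0 / n_signals] * n_signals


def backtest(log_returns, weight_fn, start_capital=1.0):
    """Walk forward with daily rebalancing and zero transaction cost.

    log_returns -- list of rows; each row holds the per-signal log returns of one day
    weight_fn   -- callable(history, n_signals) -> weights; it sees past days only
    Returns (net_log_return, final_capital).
    """
    capital = start_capital
    net_log_return = 0.0
    for i, row in enumerate(log_returns):
        # Only days 0..i-1 are visible when choosing the weights for day i.
        # (Slicing copies the history; fine for a sketch, slow for long files.)
        weights = weight_fn(log_returns[:i], len(row))
        day = math.log(sum(w * math.exp(r) for w, r in zip(weights, row)))
        net_log_return += day
        capital *= math.exp(day)
    return net_log_return, capital


# Made-up data: two signals over two days.
rows = [[0.01, -0.01], [0.02, 0.0]]
print(backtest(rows, uniform_weights, start_capital=100.0))
```

With daily rebalancing and no costs, the final capital equals the starting capital times exp(net log return), so ranking by net log return and ranking by final wealth agree.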
 
FAQ:
Q: Which programming languages are allowed for this?
A: You can use anything that is freely available, along with any packages that are freely and easily available. (If you ask us, we'd probably work in Python, Julia, or C++.)
 
Q: Are there any constraints on the weights?
A: The weights must be positive and their sum must be equal to 1.
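
One way to satisfy this constraint is to rescale raw scores before emitting them. The helpers below are a minimal sketch: normalize_weights and is_valid are hypothetical names, the equal-weight fallback is an assumption, and so is treating a weight of exactly zero as acceptable (the answer above says "positive" without stating whether zero is allowed).

```python
def normalize_weights(raw_scores):
    """Scale non-negative scores so that they sum to 1."""
    if any(s < 0 for s in raw_scores):
        raise ValueError("scores must be non-negative")
    total = sum(raw_scores)
    if total <= 0:
        # Assumption: fall back to equal weights when every score is zero.
        return [1.0 / len(raw_scores)] * len(raw_scores)
    return [s / total for s in raw_scores]


def is_valid(weights, tol=1e-9):
    """Check non-negativity and that the weights sum to 1 within a tolerance."""
    return all(w >= 0 for w in weights) and abs(sum(weights) - 1.0) <= tol


weights = normalize_weights([2.0, 1.0, 1.0])
print(weights, is_valid(weights))
```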

Q: What is the format of the output weight file?
A: We expect n comma-separated weights for every date in the input file, where n is the number of signals to be combined. The weights for each date should be on a separate line.
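
A sketch of writing weights in this layout follows. The name write_weights is hypothetical, and printing each weight with Python's default float representation is an assumption, since the required numeric precision is not stated here.

```python
import io


def write_weights(rows, out):
    """Write one line per date, with that date's n weights separated by commas."""
    for weights in rows:
        out.write(",".join(repr(float(w)) for w in weights) + "\n")


# Made-up weights for two dates and three signals.
buf = io.StringIO()
write_weights([[0.5, 0.3, 0.2], [0.25, 0.25, 0.5]], buf)
print(buf.getvalue(), end="")
# 0.5,0.3,0.2
# 0.25,0.25,0.5
```

Passing an open file object, or sys.stdout, in place of the StringIO buffer would write the same lines to a file or to the console.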

Q: What is the rebalancing frequency?
A: We assume daily rebalancing.

Q: Is any transaction cost involved in rebalancing?
A: For simplicity, we assume zero transaction cost.
 
Q: Can I submit more than one entry?
A: A participant is allowed to enter at most two solutions. We will post the better of the two solutions.