Bachelor Thesis Project
IIT Bombay
I am currently also working with Professor Jayakrishnan Nair (IITB) and a colleague of mine in the area of pure exploration in multi-armed bandits, under the fixed budget setting. We wish to formulate and propose an adaptive any-time algorithm that builds on the classical Successive Rejects algorithm. In particular, we are analyzing the decay rate of the probability of the error as the budget for the task grows. We hope that our algorithm asymptotically comes within a universal constant of the information-theoretic lower bound on this decay rate, as opposed to the Successive Rejects algorithm which comes within a logarithmic factor of this lower bound. We have already coded up a rudimentary implementation of our proposed algorithm and have obtained some promising results regarding the decay rate. Currently, we are working on proving the theoretical guarantees that we expect our algorithm to