国产v亚洲v天堂无码久久无码_久久久久综合精品福利啪啪_美女扒开尿口让男人桶_国产福利第一视频在线播放_滨崎步无码AⅤ一区二区三区_三年片免费观看了_大屁股妇女流出白浆_泷川苏菲亚无码AV_我想看我想看一级男同乱伦_国产精品午夜福利免费视频,gogo国模全球大胆高清摄影图,2008门艳照全集视频,欧美午夜在线精品品亚洲AV中文无码乱人伦在线播放

Our paper got accepted by NIPS'16
來(lái)源: 吳華森/
加州大學(xué)戴維斯分校
1649
2
0
2016-08-13

Our paper "Double Thompson Sampling for Dueling Bandits" got accepted by NIPS'16, one of the top conferences in machine learning. 

In this paper, we propose a Double Thompson Sampling (D-TS) algorithm for dueling bandit problems. As indicated by its name, D-TS selects both the first and the second candidates according to Thompson Sampling. Specifically, D-TS maintains a posterior distribution for the preference matrix, and chooses the pair of arms for comparison by sampling twice from the posterior distribution. This simple algorithm applies to general Copeland dueling bandits, including Condorcet dueling bandits as its special case. For general Copeland dueling bandits, we show that D-TS achieves O(K^2 log T) regret. For Condorcet dueling bandits, we further simplify the D-TS algorithm and show that the simplified D-TS algorithm achieves O(Klog T + K^2 log log T) regret. Simulation results based on both synthetic and real-world data demonstrate the efficiency of the proposed D-TS algorithm.


A preliminary version can be found at https://arxiv.org/abs/1604.07101.


登錄用戶可以查看和發(fā)表評(píng)論,, 請(qǐng)前往  登錄 或  注冊(cè)
SCHOLAT.com 學(xué)者網(wǎng)
免責(zé)聲明 | 關(guān)于我們 | 聯(lián)系我們
聯(lián)系我們: