Wang's blog

Basics - Portfolio Strategy

Introduction

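A portfolio strategy generates a portfolio from a model's prediction scores. Qlib ships with several implemented portfolio strategies and also lets users write their own. Once a model (the prediction signal) and a strategy have been chosen, running a backtest shows how they perform.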

Base Classes

BaseStrategy

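This class is the base class of all strategy classes. Users can inherit from it and implement its interface to add a custom strategy. Its interface is:

  • generate_trade_decision: the key interface, which generates a trade decision at each trading step. How often it is called depends on the executor, but the actual trading frequency can be controlled by user code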
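
The difference between call frequency and trading frequency can be illustrated with a toy sketch. Everything here (ToyDecision, EveryNStepsStrategy, run_toy_executor, and the no-argument generate_trade_decision) is a hypothetical stand-in written for this post, not Qlib's actual classes or signatures: the "executor" asks for a decision on every step, while the strategy itself decides to trade only on every fifth step.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDecision:
    """Hypothetical stand-in for a trade decision: a list of (instrument, amount) orders."""
    orders: list = field(default_factory=list)


class EveryNStepsStrategy:
    """Toy strategy: asked for a decision on every step, but trades only on every n-th step."""

    def __init__(self, rebalance_every: int):
        self.rebalance_every = rebalance_every
        self.step = 0

    def generate_trade_decision(self) -> ToyDecision:
        decision = ToyDecision()
        # User code controls the real trading frequency: most steps return an empty decision
        if self.step % self.rebalance_every == 0:
            decision.orders.append(("SH600000", 100))
        self.step += 1
        return decision


def run_toy_executor(strategy, n_steps):
    """Hypothetical executor loop: requests one decision per step."""
    return [strategy.generate_trade_decision() for _ in range(n_steps)]


decisions = run_toy_executor(EveryNStepsStrategy(rebalance_every=5), n_steps=10)
trading_steps = [i for i, d in enumerate(decisions) if d.orders]
print(trading_steps)  # [0, 5]
```

The strategy is called ten times but produces orders on only two of those steps.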

WeightStrategyBase

This class focuses only on the target position and generates orders automatically from it. Its interface is:

  • generate_target_weight_position
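    • Generates the target position from the current position and the trading date (cash is not included in the output weights)
    • Returns the target position, expressed as each stock's percentage of total assets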

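The class implements the generate_order_list interface as follows:

  • Calls generate_target_weight_position to generate the target position
  • Converts the target position into a target amount for each stock
  • Generates orders from the target amounts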
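
The three steps above can be sketched in plain Python. This is a simplified illustration, not Qlib's implementation: the function name weights_to_orders, the dict-based inputs, and the order tuples are all assumptions made for the example, and it ignores lot sizes, transaction costs, and trading limits.

```python
def weights_to_orders(target_weights, current_amounts, prices, total_value):
    """Turn target weights into (instrument, direction, amount) orders.

    All inputs are plain dicts keyed by instrument (hypothetical format).
    """
    # Step 2: target weight -> target number of shares
    target_amounts = {
        inst: total_value * weight / prices[inst]
        for inst, weight in target_weights.items()
    }
    # Step 3: compare target and current amounts to produce orders
    orders = []
    for inst in sorted(set(target_amounts) | set(current_amounts)):
        diff = target_amounts.get(inst, 0.0) - current_amounts.get(inst, 0.0)
        if diff > 0:
            orders.append((inst, "buy", diff))
        elif diff < 0:
            orders.append((inst, "sell", -diff))
    return orders


# Step 1 would be the strategy's generate_target_weight_position;
# here the weights are simply given. They sum to 0.75, leaving 25% in cash.
target_weights = {"SH600000": 0.5, "SZ000999": 0.25}
current_amounts = {"SH600000": 1000.0, "SZ002531": 200.0}
prices = {"SH600000": 10.0, "SZ000999": 25.0, "SZ002531": 5.0}

orders = weights_to_orders(target_weights, current_amounts, prices, total_value=100_000.0)
print(orders)
# [('SH600000', 'buy', 4000.0), ('SZ000999', 'buy', 1000.0), ('SZ002531', 'sell', 200.0)]
```

A stock that is held but missing from the target weights gets a target amount of zero, so it is sold in full.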

Implemented Strategies

TopkDropoutStrategy

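The class implements the generate_order_list interface as follows:

  • Runs the Topk-Drop algorithm to compute the target amount of each stock
  • Generates orders from the target amounts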
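
To give a feel for the selection step, here is a minimal sketch of one Topk-Drop rebalancing step. It rests on a simplified reading of the rule: hold about topk stocks, and on each step sell at most n_drop of the worst-ranked holdings while buying the best-ranked stocks not yet held. The function topk_dropout_step and its inputs are hypothetical. It assumes every held stock has a score and leaves out details a real implementation has to handle, such as tradability and position sizing.

```python
def topk_dropout_step(scores, holdings, topk, n_drop):
    """One step of a simplified Topk-Drop rule.

    scores:   dict instrument -> prediction score for the current step
    holdings: iterable of instruments currently held
    Returns (to_sell, to_buy) as lists of instruments.
    """
    holdings = set(holdings)
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first
    held = [inst for inst in ranked if inst in holdings]

    # Best-ranked instruments not yet held: enough to replace n_drop names
    # and to fill any empty slots up to topk
    n_candidates = n_drop + topk - len(held)
    candidates = [inst for inst in ranked if inst not in holdings][:n_candidates]

    # Rank holdings and candidates together, then sell the held names
    # that fall into the bottom n_drop of that combined ranking
    pool = set(held) | set(candidates)
    combined = [inst for inst in ranked if inst in pool]
    worst = set(combined[-n_drop:]) if n_drop > 0 else set()
    to_sell = [inst for inst in held if inst in worst]

    # Buy enough to replace what was sold and fill up to topk
    n_buy = len(to_sell) + topk - len(held)
    to_buy = candidates[:n_buy]
    return to_sell, to_buy


scores = {"A": 0.9, "B": 0.8, "C": 0.7, "F": 0.5, "D": 0.1, "E": -0.2}

print(topk_dropout_step(scores, ["C", "D", "E"], topk=3, n_drop=1))  # (['E'], ['A'])
print(topk_dropout_step(scores, [], topk=3, n_drop=1))               # ([], ['A', 'B', 'C'])
print(topk_dropout_step(scores, ["A", "B", "C"], topk=3, n_drop=1))  # ([], [])
```

The last call shows why the rule limits turnover: when the current holdings already outrank every candidate, nothing is traded.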

EnhancedIndexingStrategy

This strategy combines active and passive management, aiming to outperform the benchmark index while keeping risk exposure under control.

Example

Prediction Score

The prediction score is a pandas DataFrame whose index is <datetime(pd.Timestamp), instrument(str)> and which must contain a score column. An example:

  datetime instrument     score
2019-01-04   SH600000 -0.505488
2019-01-04   SZ002531 -0.320391
2019-01-04   SZ000999  0.583808
2019-01-04   SZ300569  0.819628
2019-01-04   SZ001696 -0.137140
             ...            ...
2019-04-30   SZ000996 -1.027618
2019-04-30   SH603127  0.225677
2019-04-30   SH603126  0.462443
2019-04-30   SH603133 -0.302460
2019-04-30   SZ300760 -0.126383

Note that the prediction score is not necessarily a return; its meaning depends on how each model defines it.
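
For experimenting without a trained model, a frame of this shape can be built by hand. The sketch below reuses a few rows from the table above. The variable name pred_score matches the backtest snippets in this post, but in practice the scores would come from a model's predictions.

```python
import pandas as pd

rows = [
    ("2019-01-04", "SH600000", -0.505488),
    ("2019-01-04", "SZ002531", -0.320391),
    ("2019-01-04", "SZ000999", 0.583808),
    ("2019-04-30", "SZ000996", -1.027618),
    ("2019-04-30", "SH603127", 0.225677),
]
pred_score = pd.DataFrame(rows, columns=["datetime", "instrument", "score"])
pred_score["datetime"] = pd.to_datetime(pred_score["datetime"])  # pd.Timestamp values
pred_score = pred_score.set_index(["datetime", "instrument"]).sort_index()

# Cross-section for one day: the highest-scored instrument on 2019-01-04
day = pred_score.xs(pd.Timestamp("2019-01-04"), level="datetime")
best = day["score"].idxmax()
print(pred_score)
print(best)  # SZ000999
```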

Running a Backtest

  • In most cases, users can backtest their portfolio strategy with backtest_daily:
from pprint import pprint

import qlib
import pandas as pd
from qlib.utils.time import Freq
from qlib.utils import flatten_dict
from qlib.contrib.evaluate import backtest_daily
from qlib.contrib.evaluate import risk_analysis
from qlib.contrib.strategy import TopkDropoutStrategy

# Initialize Qlib; replace the placeholder with the path to your Qlib data directory
qlib.init(provider_uri="<qlib data dir>")

CSI300_BENCH = "SH000300"
STRATEGY_CONFIG = {
    "topk": 50,
    "n_drop": 5,
    # pred_score: the prediction scores (a pd.Series here); must be defined before this point
    "signal": pred_score,
}


strategy_obj = TopkDropoutStrategy(**STRATEGY_CONFIG)
report_normal, positions_normal = backtest_daily(
    start_time="2017-01-01", end_time="2020-08-01", strategy=strategy_obj
)
analysis = dict()
# The default frequency is daily
analysis["excess_return_without_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"])
analysis["excess_return_with_cost"] = risk_analysis(report_normal["return"] - report_normal["bench"] - report_normal["cost"])

analysis_df = pd.concat(analysis)  # type: pd.DataFrame
pprint(analysis_df)
  • For finer-grained control over the strategy, refer to the following example:
from pprint import pprint

import qlib
import pandas as pd
from qlib.utils.time import Freq
from qlib.utils import flatten_dict
from qlib.backtest import backtest, executor
from qlib.contrib.evaluate import risk_analysis
from qlib.contrib.strategy import TopkDropoutStrategy

# Initialize Qlib; replace the placeholder with the path to your Qlib data directory
qlib.init(provider_uri="<qlib data dir>")

CSI300_BENCH = "SH000300"
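# `benchmark` is used to compute the strategy's excess return.
# Its data format is the same as that of an ordinary instrument.
# For example, its data can be queried with:
# `D.features(["SH000300"], ["$close"], start_time='2010-01-01', end_time='2017-12-31', freq='day')`
# By contrast, `market` denotes a universe of stocks (e.g. csi300).
# For example, all data of a stock market can be queried with:
# `D.features(D.instruments(market='csi300'), ["$close"], start_time='2010-01-01', end_time='2017-12-31', freq='day')`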

FREQ = "day"
STRATEGY_CONFIG = {
    "topk": 50,
    "n_drop": 5,
    # pred_score: the prediction scores (a pd.Series here); must be defined before this point
    "signal": pred_score,
}

EXECUTOR_CONFIG = {
    "time_per_step": "day",
    "generate_portfolio_metrics": True,
}

backtest_config = {
    "start_time": "2017-01-01",
    "end_time": "2020-08-01",
    "account": 100000000,
    "benchmark": CSI300_BENCH,
    "exchange_kwargs": {
        "freq": FREQ,
        "limit_threshold": 0.095,
        "deal_price": "close",
        "open_cost": 0.0005,
        "close_cost": 0.0015,
        "min_cost": 5,
    },
}

# Strategy object
strategy_obj = TopkDropoutStrategy(**STRATEGY_CONFIG)
# Executor object
executor_obj = executor.SimulatorExecutor(**EXECUTOR_CONFIG)
# Run the backtest
portfolio_metric_dict, indicator_dict = backtest(executor=executor_obj, strategy=strategy_obj, **backtest_config)
analysis_freq = "{0}{1}".format(*Freq.parse(FREQ))
# Backtest report and positions
report_normal, positions_normal = portfolio_metric_dict.get(analysis_freq)

# Analysis
analysis = dict()
analysis["excess_return_without_cost"] = risk_analysis(
    report_normal["return"] - report_normal["bench"], freq=analysis_freq
)
analysis["excess_return_with_cost"] = risk_analysis(
    report_normal["return"] - report_normal["bench"] - report_normal["cost"], freq=analysis_freq
)

analysis_df = pd.concat(analysis)  # type: pd.DataFrame
# Collect the metrics into a flat dict
analysis_dict = flatten_dict(analysis_df["risk"].unstack().T.to_dict())
# Print the results
pprint(f"The following are analysis results of benchmark return({analysis_freq}).")
pprint(risk_analysis(report_normal["bench"], freq=analysis_freq))
pprint(f"The following are analysis results of the excess return without cost({analysis_freq}).")
pprint(analysis["excess_return_without_cost"])
pprint(f"The following are analysis results of the excess return with cost({analysis_freq}).")
pprint(analysis["excess_return_with_cost"])
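
The cost fields in exchange_kwargs are not explained in this post. A plausible reading, stated here as an assumption rather than as Qlib's documented behavior, is that open_cost and close_cost are proportional fee rates for buying and selling, and min_cost is a floor applied to each trade. Under that assumption, the fee for a single trade can be sketched as follows (trade_cost is a hypothetical helper, not a Qlib function):

```python
def trade_cost(trade_value, direction, open_cost=0.0005, close_cost=0.0015, min_cost=5):
    """Fee for one trade under the assumed model: proportional rate with a minimum charge."""
    if direction == "buy":
        rate = open_cost
    elif direction == "sell":
        rate = close_cost
    else:
        raise ValueError(f"unknown direction: {direction!r}")
    return max(trade_value * rate, min_cost)


buy_fee = trade_cost(1_000_000, "buy")    # about 500: 0.05% of the traded value
sell_fee = trade_cost(1_000_000, "sell")  # about 1500: 0.15% of the traded value
small_fee = trade_cost(1_000, "buy")      # 5: the minimum charge applies
print(buy_fee, sell_fee, small_fee)
```

With a minimum charge, very small trades are proportionally much more expensive, which is one reason the with-cost and without-cost results in the analysis can differ noticeably for high-turnover strategies.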

Results

The backtest results look like this:

                                                  risk
excess_return_without_cost mean               0.000605
                           std                0.005481
                           annualized_return  0.152373
                           information_ratio  1.751319
                           max_drawdown      -0.059055
excess_return_with_cost    mean               0.000410
                           std                0.005478
                           annualized_return  0.103265
                           information_ratio  1.187411
                           max_drawdown      -0.075024

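Field descriptions:

  • excess_return_without_cost: excess return without transaction costs
    • mean: mean of the excess return
    • std: standard deviation of the excess return
    • annualized_return: annualized return
    • information_ratio: information ratio
    • max_drawdown: maximum drawdown
  • excess_return_with_cost: excess return with transaction costs
    • mean: mean of the excess return
    • std: standard deviation of the excess return
    • annualized_return: annualized return
    • information_ratio: information ratio
    • max_drawdown: maximum drawdown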
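
To make these fields concrete, here is a minimal sketch that computes them from a list of per-period excess returns. The conventions are assumptions, not taken from the risk_analysis source: returns are accumulated by summation, the standard deviation is the sample one, and 252 periods per year are used for annualization. The sample table above is roughly consistent with that scaling (0.000605 × 252 ≈ 0.1525 against a reported 0.152373), but the exact convention may differ between Qlib versions. The helper risk_summary is hypothetical.

```python
import math
import statistics


def risk_summary(returns, periods_per_year=252):
    """Summary statistics of a per-period (excess) return series, under the assumptions above."""
    mean = statistics.fmean(returns)
    std = statistics.stdev(returns)  # sample standard deviation

    # Maximum drawdown on the cumulative sum of returns:
    # the largest drop from a running peak (zero or negative)
    cumulative, peak, max_drawdown = 0.0, -math.inf, 0.0
    for r in returns:
        cumulative += r
        peak = max(peak, cumulative)
        max_drawdown = min(max_drawdown, cumulative - peak)

    return {
        "mean": mean,
        "std": std,
        "annualized_return": mean * periods_per_year,
        "information_ratio": mean / std * math.sqrt(periods_per_year),
        "max_drawdown": max_drawdown,
    }


summary = risk_summary([0.01, -0.02, 0.03, -0.01])
for name, value in summary.items():
    print(f"{name:18s} {value: .6f}")
```

For this four-period series the mean is 0.0025 and the maximum drawdown is about -0.02, the drop from the first period's peak to the trough after the second period.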