[Paper Recommendations] 6 must-read papers on Reinforcement Learning for Communications (with download instructions)



Paper Recommendations

For SFFAI session 136, Yu Huihan from Beijing University of Posts and Telecommunications recommends papers on reinforcement learning for communications, a subfield of deep reinforcement learning. Read the speaker's recommended papers and join the online discussion with the speaker and fellow researchers.

Follow the official WeChat account

Reply "SFFAI136" to receive the selected papers for this topic

01

Why recommended:

Coefficients of selfish and altruistic strategies. They propose using deep reinforcement learning to determine the balancing coefficients between selfish and altruistic strategies in coordinated beamforming. Using a balance coefficient to coordinate beamforming is a novel approach.

MIMO configuration. The performance of the proposed scheme was simulated and evaluated in experiments covering the multiple-input multiple-output (MIMO) configuration, shadow fading, and state-design options. The paper elaborates on the beamforming formulation under a MIMO configuration, which can inspire subsequent researchers.
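The balancing idea can be sketched as a weighted utility. This is a hypothetical formulation for illustration only: the function name, the mean-rate altruistic term, and the [0, 1] range of the coefficient are my assumptions, not the paper's exact objective (the paper learns the coefficients with DRL rather than fixing them by hand).

```python
import numpy as np

def balanced_beamforming_utility(own_rate, neighbor_rates, alpha):
    """Combine a selfish term (the cell's own rate) with an altruistic
    term (the mean rate of neighboring cells) through a balancing
    coefficient alpha in [0, 1]: alpha = 1 is fully selfish,
    alpha = 0 is fully altruistic."""
    altruistic = float(np.mean(neighbor_rates))
    return alpha * own_rate + (1.0 - alpha) * altruistic
```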

02

Why recommended:

Distributed multi-agent algorithm. They propose a distributed multi-agent double deep Q-learning network (DDQN) solution for beamforming in mmWave MIMO networks. The learning-based algorithm achieves performance comparable to exhaustive search while operating at much lower complexity.

Dynamic environment. In this system, users (UEs) move to different locations at each time step and may be served by different base stations (BSs) according to the adopted largest-received-power association criterion. The simulation results show that the proposed distributed multi-agent DDQN solution adapts to UE mobility.
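The core of any DDQN agent is its bootstrap target, which can be sketched as follows. This is the generic double-DQN update, not the paper's exact beam-selection formulation; the function name and signature are my own.

```python
import numpy as np

def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double-DQN bootstrap target: the online network *selects* the
    next action, the target network *evaluates* it, which reduces the
    overestimation bias of vanilla DQN. In a distributed multi-agent
    setup, each agent applies this to its own local Q-values."""
    if done:
        return float(reward)
    best_action = int(np.argmax(next_q_online))                # selection (online net)
    return float(reward + gamma * next_q_target[best_action])  # evaluation (target net)
```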


03

Why recommended:

RIS. They investigate the joint design of transmit beamforming at the BS and phase shifts at a reflecting reconfigurable intelligent surface (RIS) to maximize the sum rate of a multiuser downlink multiple-input single-output (MISO) system using DRL, assuming that direct transmission between the BS and the users is totally blocked.

DDPG framework. They use the policy-based deep deterministic policy gradient (DDPG) algorithm, derived from a Markov decision process formulation, to handle the continuous beamforming matrix and phase shifts. Since both the transmit beamforming matrix and the phase shifts are continuous, DDPG performs better than designs that operate over a discrete action space.

Low complexity. The proposed DRL-based algorithm has a standard formulation and low implementation complexity, requiring neither an explicit model of the wireless environment nor problem-specific mathematical derivations, so it scales easily to various system settings.
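The advantage of a continuous action space here is that the actor's output can be mapped directly to RIS phase shifts, with no codebook quantization. The mapping below is illustrative (the function name and parameterization are my assumptions, not the paper's exact design):

```python
import numpy as np

def actor_output_to_ris_phases(raw_action):
    """Map an unbounded continuous actor output to RIS phase shifts in
    [0, 2*pi) and the corresponding unit-modulus reflection
    coefficients e^{j*theta}. A deterministic continuous policy such
    as DDPG can emit these values directly, with no need to quantize
    the phases into a discrete codebook."""
    theta = np.mod(np.asarray(raw_action, dtype=float), 2.0 * np.pi)
    return theta, np.exp(1j * theta)
```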

04

Why recommended:

System model. The issue of MEC-empowered energy-efficient resource allocation has not been well studied. They present a system model for green mobile edge computing (MEC) scenarios and formulate the long-term average energy-efficiency maximization problem.

PPO framework. They propose a decentralized MADRL resource-allocation algorithm for energy efficiency in green MEC. The algorithm adopts the PPO framework, which is relatively new in DRL. Compared to three baseline methods (DQN, FP, and Random), the proposed algorithm achieves the best performance under different environment settings.

Robustness. They conducted extensive experiments to evaluate the effectiveness and robustness of their proposed algorithm.
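The PPO update at the heart of the proposed algorithm can be sketched as the standard clipped surrogate loss. This is a generic PPO-Clip sketch, not the paper's exact MADRL formulation; the function name and the default clip range are assumptions.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate loss for one sample. `ratio` is
    pi_new(a|s) / pi_old(a|s); clipping it to [1 - eps, 1 + eps] and
    taking the pessimistic minimum keeps each policy update close to
    the old policy. Returned negated, since optimizers minimize."""
    unclipped = ratio * advantage
    clipped = float(np.clip(ratio, 1.0 - eps, 1.0 + eps)) * advantage
    return -min(unclipped, clipped)
```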

05

Why recommended:

DRL for LEO satellite networks. A two-hop state-aware routing strategy based on deep reinforcement learning (DRL-THSA) is proposed for LEO satellite networks. In DRL-THSA, each node collects link-state information from its two-hop neighbors and makes routing decisions based on that information. The link-state information is exchanged between nodes via Hello packets, so DRL-THSA can discover node-failure events in time and switch to another next-hop node.

Novel link state. A setup and update method for the link state is proposed. The link state is divided into three levels, and a traffic-forwarding strategy for each level is presented, which allows DRL-THSA to cope with link congestion.
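The three-level link state can be sketched as a simple load classifier. The thresholds and level names here are hypothetical; the paper defines its own setup and update rules and a forwarding strategy per level.

```python
def link_state_level(utilization, busy=0.5, congested=0.8):
    """Classify a link into one of three levels by its load, in the
    spirit of the three-level link state in DRL-THSA:
    0 = idle, 1 = busy, 2 = congested."""
    if utilization < busy:
        return 0
    if utilization < congested:
        return 1
    return 2
```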

06

Why recommended:

DRL-based traffic engineering (TE). They present a comprehensive overview of DRL-based TE and of the TE issues within the paper's scope from three aspects: routing optimization, congestion control, and resource management. Having distilled the major TE issues, they discuss the general procedure for formulating TE as an RL problem based on this analogy. Finally, they review fundamental DRL algorithms in three classes: value-based, policy-based, and actor-critic. This helps readers quickly understand DRL algorithms and the feasibility of DRL for TE problems.

Detailed literature review. They conduct a detailed literature review of DRL applications in TE across three categories: routing optimization, congestion control, and resource management. The works are introduced from basic DRL models to advanced DRL models, along with their comparison and relationships.

Talk Details

Speaker Bio

Yu Huihan is a graduate student at the Intelligent Perception and Computing Research Center, Beijing University of Posts and Telecommunications, focusing on applications of deep reinforcement learning in communication networks, and has published a paper at the WCNC conference.

Talk Title

Deep Reinforcement Learning Based Beamforming for Throughput Maximization in Ultra-Dense Networks

Abstract

Ultra-dense network (UDN) is a promising technology for 5G and beyond communication systems to meet the requirements of explosive data traffic. However, the dense distribution of wireless terminals potentially leads to severe interference and deteriorates network performance. To address this issue, beamforming is widely used to coordinate the interference in UDNs and improve receive gains by controlling the phases of multiple antennas. In this paper, we propose a multi-agent deep reinforcement learning (DRL) based beamforming algorithm to achieve more dynamic and fast beamforming adjustment. In the proposed algorithm, the agents inside beamforming controllers are trained distributively while exchanging partial channel state information (CSI) to better optimize beamforming vectors and maximize throughput in UDNs. The evaluation results demonstrate that the proposed algorithm significantly improves computation efficiency and achieves the highest network throughput compared to several baselines.

Title: Deep Reinforcement Learning Based Beamforming for Throughput Maximization in Ultra-Dense Networks

Paper download: follow this official account and reply "SFFAI136" in the chat box to get the download

Highlights

1. We propose a multi-agent DRL-based algorithm that drastically improves beamforming computation efficiency. Agents train their deep Q-networks (DQNs) and execute actions in a distributed manner. Each agent only needs to exchange part of the global CSI during the entire training phase.

2. To improve computation efficiency, discrete actions are used instead of continuous actions. In addition, we divide the beamformer into two parts, namely the transmit power and the beam direction, in order to minimize interference.

3. We design a specific reward function to ensure that the transmission rates of primary users stay above a given threshold, while avoiding the generation of excessive interference to secondary users.
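A reward in the spirit of highlight 3 can be sketched as follows. All names and constants here are illustrative assumptions, not the talk's actual reward function.

```python
def shaped_reward(primary_rate, interference_to_secondary,
                  rate_threshold, penalty=10.0, beta=1.0):
    """Hypothetical reward shaping: pay a fixed penalty whenever a
    primary user's rate drops below its threshold; otherwise reward
    the rate minus a weighted interference term, so the agent learns
    to avoid excessive interference to secondary users."""
    if primary_rate < rate_threshold:
        return -penalty
    return primary_rate - beta * interference_to_secondary
```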

Live Stream Time

February 13, 2022 (Sunday), 20:00–21:00, online live stream

Follow this official account and reply "SFFAI136" in the chat box to get the group QR code

Note: the live-stream link will be shared in the discussion group

Join SFFAI!

Modern science and technology are highly socialized and increasingly tend toward synthesis and unification in both theory and methods. To meet the need of researchers in different AI fields to exchange ideas and inspire one another, we launched SFFAI as a public-interest activity. SFFAI holds one offline session per week, inviting front-line researchers to share and discuss cutting-edge ideas and the latest results across all areas of artificial intelligence, helping researchers focused on narrow subfields broaden their horizons and draw cross-disciplinary insights.

SFFAI currently focuses on frontier progress in machine learning, computer vision, natural language processing, and other vertical and interdisciplinary AI fields. The offline discussions are disseminated online so that newcomers can avoid common pitfalls, while speakers build their personal influence. SFFAI is also building a knowledge forest for AI, the AI Knowledge Forest, by aggregating the domain knowledge contributed by participants and distilling the essence of offline talks, so that the AI Knowledge Tree flourishes and contributes to the AI community. You are welcome to follow the SFFAI forum.

Paper Recommendations

  • Video Prediction paper recommendations

  • Relation Extraction (special topic) paper recommendations

  • Adversarial Machine Learning paper recommendations

  • Object Tracking paper recommendations

  • Code Generation paper recommendations

  • Pre-trained Models paper recommendations

  • Machine Translation (special topic) paper recommendations

  • Script Event Prediction (special topic) paper recommendations

  • Multimodal Processing paper recommendations

  • Point Cloud Completion paper recommendations

  • Image Generation paper recommendations

  • Machine Translation paper recommendations

  • Text Representation paper recommendations

  • Conversational Machine Reading Comprehension paper recommendations

For more past paper-recommendation posts,

please click "Read the original" at the bottom of the article

