"Hadoop: The Definitive Guide" Reading Notes, Part 4 — Chapter 4: YARN

What is YARN? YARN stands for Yet Another Resource Negotiator.

What does YARN do? YARN provides APIs for requesting and working with cluster resources, but these APIs are not typically used directly by user code. Instead, users write against the higher-level APIs of distributed computing frameworks, which are built on YARN and hide the resource-management details from the user.
How an application runs on YARN: to run an application, a client contacts the resource manager and asks it to run an application master process. The resource manager then finds a node manager that can launch the application master in a container. (This process is similar to how a MapReduce job is launched.)

How MapReduce requests containers: a MapReduce job has two phases, map and reduce, and the two acquire containers in different ways. Requests for map-task containers are made up front (with data-locality preferences), whereas requests for reduce containers are not made until enough map tasks have completed.
YARN scheduling policies: YARN provides three schedulers, FIFO, Capacity, and Fair.

FIFO: the FIFO Scheduler is a poor fit for shared clusters, where all jobs draw on the same pool of resources. One long-running job forces every job submitted after it to wait, so overall throughput is low.
Capacity Scheduler Configuration
synopsis
The Capacity Scheduler allows sharing of a Hadoop cluster along organizational lines, whereby each organization is allocated a certain capacity of the overall cluster.
Each organization is set up with a dedicated queue that is configured to use a given fraction of the cluster capacity.
Queues may be further divided in hierarchical fashion, allowing each organization to share its cluster allowance between different groups of users within the organization.
Within a queue, applications are scheduled using FIFO scheduling.
In normal operation, a single job does not use more resources than its queue's capacity. However, if a queue has spare cluster capacity available because other queues are under-utilized, the Capacity Scheduler may allocate the spare resources to jobs in the queue, even if that causes the queue's capacity to be exceeded. This behavior is known as queue elasticity.

The downside of queue elasticity is that a queue's capacity may be consumed by other queues' jobs, and a new application submitted to that queue then cannot start at once (it must wait for the borrowed containers to finish and release their resources). In other words, if a queue is under capacity due to lack of demand, and then demand increases, the queue will only return to capacity as resources are released from other queues as containers complete.
It is possible to mitigate this by configuring queues with a maximum capacity so that they don’t eat into other queues’ capacities too much. This is at the cost of queue elasticity, of course, so a reasonable trade-off should be found by trial and error.
A simple example of configuring the queue capacities:
<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>prod,dev</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>40</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.queues</name>
    <value>eng,science</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
    <value>75</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.eng.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.science.capacity</name>
    <value>50</value>
  </property>
</configuration>
This configuration defines the following queue hierarchy:

root
├── prod
└── dev
    ├── eng
    └── science
<property>
  <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
  <value>75</value>
</property>
=> So that the dev queue does not use up all the cluster resources when the prod queue is idle, it has its maximum capacity set to 75%. In other words, the prod queue always has 25% of the cluster available for immediate use.
Note
Queue placement
The way that you specify which queue an application is placed in is specific to the application.
In MapReduce, you set the property mapreduce.job.queuename to the name of the queue you want to use. If the queue does not exist, you'll get an error at submission time. If no queue is specified, applications are placed in a queue called default.
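For example (a sketch assuming the prod/dev hierarchy above), a MapReduce job could be directed to the prod queue by setting the property in the job configuration:

```xml
<!-- job configuration (or pass -Dmapreduce.job.queuename=prod on the command line) -->
<property>
  <name>mapreduce.job.queuename</name>
  <value>prod</value>
</property>
```

Note that with the Capacity Scheduler the queue name is the last part of the hierarchical name, so eng rather than root.dev.eng.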
Fair Scheduler Configuration
The scheduler in use is determined by the setting yarn.resourcemanager.scheduler.class. The Capacity Scheduler is used by default; to use the Fair Scheduler, set this property to the fully qualified classname org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.
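A minimal sketch of that setting in yarn-site.xml:

```xml
<!-- yarn-site.xml: select the Fair Scheduler (the Capacity Scheduler is the default) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
```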
The Fair Scheduler's queues are configured in an allocation file (fair-scheduler.xml by default):

<?xml version="1.0"?>
<allocations>
  <defaultQueueSchedulingPolicy>fair</defaultQueueSchedulingPolicy>
  <queue name="prod">
    <weight>40</weight>
    <schedulingPolicy>fifo</schedulingPolicy>
  </queue>
  <queue name="dev">
    <weight>60</weight>
    <queue name="eng" />
    <queue name="science" />
  </queue>
  <queuePlacementPolicy>
    <rule name="specified" create="false" />
    <rule name="primaryGroup" create="false" />
    <rule name="default" queue="dev.eng" />
  </queuePlacementPolicy>
</allocations>
Queue placement
The Fair Scheduler uses a rules-based system to determine which queue an application is placed in.
<queuePlacementPolicy>
  <rule name="specified" create="false" />
  <rule name="primaryGroup" create="false" />
  <rule name="default" queue="dev.eng" />
</queuePlacementPolicy>
The queuePlacementPolicy element contains a list of rules, each of which is tried in turn until a match occurs. The first rule, specified, places an application in the queue it specified; if no queue is specified, or the specified queue does not exist, the rule does not match and the next rule is tried.
The primaryGroup rule tries to place an application in a queue named after the user's primary Unix group; if no such queue exists, then rather than creating it (create="false"), the next rule is tried. The default rule is a catch-all that places the application in the dev.eng queue.

If the queuePlacementPolicy element is omitted entirely, the default placement policy is used, which is equivalent to:

<queuePlacementPolicy>
  <rule name="specified" />
  <rule name="user" />
</queuePlacementPolicy>
Another simple queue placement policy is one where all applications are placed in the same (default) queue. This allows resources to be shared fairly between applications, rather than users. The definition is equivalent to this:
<queuePlacementPolicy><rule name="default" />
</queuePlacementPolicy>
Note that preemption reduces overall cluster efficiency, since the terminated containers need to be reexecuted.
Preemption is enabled globally by setting yarn.scheduler.fair.preemption to true.
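A minimal sketch of enabling it in yarn-site.xml:

```xml
<!-- yarn-site.xml: enable preemption for the Fair Scheduler -->
<property>
  <name>yarn.scheduler.fair.preemption</name>
  <value>true</value>
</property>
```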
Delay Scheduling
All the YARN schedulers try to honor locality requests.
On a busy cluster, if an application requests a particular node, there is a good chance that other containers are running on it at the time of the request.
However, it has been observed in practice that waiting a short time (no more than a few seconds) can dramatically increase the chances of being allocated a container on the requested node, and therefore increase the efficiency of the cluster.
This feature is called delay scheduling, and it is supported by both the Capacity Scheduler and the Fair Scheduler.
What do the heartbeats that a node manager sends to the resource manager contain? Every node manager in a YARN cluster periodically sends a heartbeat request to the resource manager (by default, one per second).
Heartbeats carry information about the node manager's running containers and the resources available for new containers, so each heartbeat is a potential scheduling opportunity for an application to run a container.
For the Capacity Scheduler, delay scheduling is configured by setting yarn.scheduler.capacity.node-locality-delay to a positive integer representing the number of scheduling opportunities that it is prepared to miss before loosening the node constraint to match any node in the same rack.
The Fair Scheduler also uses the number of scheduling opportunities to determine the delay, although it is expressed as a proportion of the cluster size. For example, setting yarn.scheduler.fair.locality.threshold.node to 0.5 means that the scheduler should wait until half of the nodes in the cluster have presented scheduling opportunities before accepting another node in the same rack.
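A sketch of the two settings side by side (the values here are illustrative, not recommendations):

```xml
<!-- capacity-scheduler.xml: relax to rack locality after 40 missed scheduling opportunities -->
<property>
  <name>yarn.scheduler.capacity.node-locality-delay</name>
  <value>40</value>
</property>

<!-- yarn-site.xml (Fair Scheduler): wait until half the nodes have offered an opportunity -->
<property>
  <name>yarn.scheduler.fair.locality.threshold.node</name>
  <value>0.5</value>
</property>
```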
Dominant Resource Fairness
abstract
When there is only a single resource type being scheduled, such as memory, then the
concept of capacity or fairness is easy to determine. If two users are running applications,
you can measure the amount of memory that each is using to compare the two
applications.
However, when there are multiple resource types in play, things get more
complicated. If one user’s application requires lots of CPU but little memory and the
other’s requires little CPU and lots of memory, how are these two applications compared?
The way that the schedulers in YARN address this problem is to look at each user's
dominant resource and use it as a measure of the cluster usage. This approach is called
Dominant Resource Fairness, or DRF for short.
example
Imagine a cluster with a total of 100 CPUs and 10 TB of memory. Application A requests
containers of (2 CPUs, 300 GB), and application B requests containers of (6 CPUs, 100
GB). A’s request is (2%, 3%) of the cluster, so memory is dominant since its proportion
(3%) is larger than CPU’s (2%). B’s request is (6%, 1%), so CPU is dominant. Since B’s
container requests are twice as big in the dominant resource (6% versus 3%), it will be
allocated half as many containers under fair sharing.
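The arithmetic in this example can be sketched in a few lines (a hypothetical illustration of the DRF calculation, not YARN's actual implementation; the cluster and request figures are taken from the example above):

```python
# Cluster from the example: 100 CPUs and 10 TB (10,000 GB) of memory.
CLUSTER = {"cpu": 100, "mem_gb": 10_000}

def dominant_share(request):
    """Return (dominant resource, its fraction of the cluster) for a container request."""
    shares = {r: request[r] / CLUSTER[r] for r in CLUSTER}
    dominant = max(shares, key=shares.get)
    return dominant, shares[dominant]

# Application A asks for (2 CPUs, 300 GB) per container: memory is dominant at 3%.
print(dominant_share({"cpu": 2, "mem_gb": 300}))  # ('mem_gb', 0.03)
# Application B asks for (6 CPUs, 100 GB) per container: CPU is dominant at 6%.
print(dominant_share({"cpu": 6, "mem_gb": 100}))  # ('cpu', 0.06)
# B's dominant share is twice A's, so under fair sharing B gets half as many containers.
```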
configuration
By default DRF is not used, so during resource calculations, only memory is considered
and CPU is ignored.
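As a sketch of how DRF can be enabled for each scheduler (using the standard Hadoop configuration keys): the Capacity Scheduler switches its resource calculator in capacity-scheduler.xml, while the Fair Scheduler sets its default queue scheduling policy in the allocation file.

```xml
<!-- capacity-scheduler.xml: account for CPU as well as memory via DRF -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>

<!-- fair-scheduler.xml: make DRF the default scheduling policy -->
<allocations>
  <defaultQueueSchedulingPolicy>drf</defaultQueueSchedulingPolicy>
</allocations>
```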
Published: 2024-02-01 03:39:53