Mastering Hadoop 3

Configuring capacity scheduler 

The capacity scheduler allows resources to be shared across an organization, which enables multi-tenancy and helps increase the utilization of a Hadoop cluster. Different departments in an organization have different cluster requirements, so each needs a specific share of resources reserved for the jobs it submits. Each department gets its own queue, and the capacity reserved for that queue is used by the users belonging to the department. If no applications are submitted to a queue, its unused resources become available to applications in other queues.

The first step in configuring the capacity scheduler is to set the Resource Manager's scheduler class to the capacity scheduler in yarn-site.xml, as follows:

<property>
     <name>yarn.resourcemanager.scheduler.class</name>
     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

Queue properties are set in the capacity-scheduler.xml file. The queue allocation is defined as follows:

<?xml version="1.0"?>
<configuration>
  <!-- Top-level queues under root -->
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>A,B</value>
  </property>
  <!-- Queue B is split into child queues C and D -->
  <property>
    <name>yarn.scheduler.capacity.root.B.queues</name>
    <value>C,D</value>
  </property>
  <!-- A and B share the cluster 60/40; sibling capacities must total 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.A.capacity</name>
    <value>60</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.B.capacity</name>
    <value>40</value>
  </property>
  <!-- B may grow to at most 75% of the cluster when A has idle resources -->
  <property>
    <name>yarn.scheduler.capacity.root.B.maximum-capacity</name>
    <value>75</value>
  </property>
  <!-- C and D each get 50% of B's share, that is, 20% of the cluster -->
  <property>
    <name>yarn.scheduler.capacity.root.B.C.capacity</name>
    <value>50</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.B.D.capacity</name>
    <value>50</value>
  </property>
</configuration>

The capacity scheduler also provides a way to configure ACLs for queues, along with some advanced configuration options.
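
As a minimal sketch of queue ACLs, the following capacity-scheduler.xml fragment restricts who may submit applications to queue A and who may administer it. The user and group names (alice, bob, carol, analytics) are hypothetical and only illustrate the value format, which is a comma-separated list of users, a space, and a comma-separated list of groups. The sketch assumes that ACLs have been switched on by setting yarn.acl.enable to true in yarn-site.xml. Because a user allowed by a parent queue is also allowed on its children, and the root queue defaults to * (everyone), the root ACL is set to a single space (no one) here so that the restriction on A takes effect:

<!-- Assumption: yarn.acl.enable is set to true in yarn-site.xml -->
<property>
  <!-- A single space means no one; overrides the default of * on root -->
  <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
  <value> </value>
</property>
<property>
  <!-- Hypothetical users alice and bob, plus the hypothetical group analytics -->
  <name>yarn.scheduler.capacity.root.A.acl_submit_applications</name>
  <value>alice,bob analytics</value>
</property>
<property>
  <!-- Hypothetical user carol may administer queue A -->
  <name>yarn.scheduler.capacity.root.A.acl_administer_queue</name>
  <value>carol</value>
</property>

After editing capacity-scheduler.xml, the queue configuration can typically be reloaded without restarting the Resource Manager by running yarn rmadmin -refreshQueues.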