From: "Mike Pagano" <mpagano@gentoo.org>
To: gentoo-commits@lists.gentoo.org
Subject: [gentoo-commits] proj/linux-patches:3.14 commit in: /
Date: Fri, 9 Jan 2015 16:18:50 +0000 (UTC)
Message-ID: <1420820324.0b8b3489401caaf8a95a16b8636504a3179aa437.mpagano@gentoo>
commit: 0b8b3489401caaf8a95a16b8636504a3179aa437
Author: Mike Pagano <mpagano <AT> gentoo <DOT> org>
AuthorDate: Fri Jan 9 16:18:44 2015 +0000
Commit: Mike Pagano <mpagano <AT> gentoo <DOT> org>
CommitDate: Fri Jan 9 16:18:44 2015 +0000
URL: http://sources.gentoo.org/gitweb/?p=proj/linux-patches.git;a=commit;h=0b8b3489
BFQ upgrade to v7r7
---
0000_README | 12 +-
...oups-kconfig-build-bits-for-BFQ-v7r7-3.14.patch | 6 +-
...ntroduce-the-BFQ-v7r7-I-O-sched-for-3.14.patch1 | 2023 ++++++++++++++------
...ly-Queue-Merge-EQM-to-BFQ-v7r7-for-3.14.0.patch | 530 +++--
4 files changed, 1829 insertions(+), 742 deletions(-)
diff --git a/0000_README b/0000_README
index 9766ed5..0f5f3a0 100644
--- a/0000_README
+++ b/0000_README
@@ -190,15 +190,15 @@ Patch: 5000_enable-additional-cpu-optimizations-for-gcc.patch
From: https://github.com/graysky2/kernel_gcc_patch/
Desc: Kernel patch enables gcc optimizations for additional CPUs.
-Patch: 5001_BFQ-1-block-cgroups-kconfig-build-bits-for-v7r2-3.14.patch
+Patch: 5001_BFQ-1-block-cgroups-kconfig-build-bits-for-v7r7-3.14.patch
From: http://algo.ing.unimo.it/people/paolo/disk_sched/
-Desc: BFQ v7r2 patch 1 for 3.14: Build, cgroups and kconfig bits
+Desc: BFQ v7r7 patch 1 for 3.14: Build, cgroups and kconfig bits
-Patch: 5002_BFQ-2-block-introduce-the-v7r2-I-O-sched-for-3.14.patch1
+Patch: 5002_BFQ-2-block-introduce-the-v7r7-I-O-sched-for-3.14.patch1
From: http://algo.ing.unimo.it/people/paolo/disk_sched/
-Desc: BFQ v7r2 patch 2 for 3.14: BFQ Scheduler
+Desc: BFQ v7r7 patch 2 for 3.14: BFQ Scheduler
-Patch: 5003_BFQ-3-block-add-Early-Queue-Merge-EQM-v7r2-for-3.14.0.patch
+Patch: 5003_BFQ-3-block-add-Early-Queue-Merge-EQM-v7r7-for-3.14.0.patch
From: http://algo.ing.unimo.it/people/paolo/disk_sched/
-Desc: BFQ v7r2 patch 3 for 3.14: Early Queue Merge (EQM)
+Desc: BFQ v7r7 patch 3 for 3.14: Early Queue Merge (EQM)
diff --git a/5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r2-3.14.patch b/5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r7-3.14.patch
similarity index 97%
rename from 5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r2-3.14.patch
rename to 5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r7-3.14.patch
index 2572376..064b032 100644
--- a/5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r2-3.14.patch
+++ b/5001_BFQ-1-block-cgroups-kconfig-build-bits-for-BFQ-v7r7-3.14.patch
@@ -1,7 +1,7 @@
-From c3280db98437c9520f04ecacfdf1a868d7a4b7b3 Mon Sep 17 00:00:00 2001
+From 5a63dcf91cb3c8f1550016ab34d4aaa6ebbb82fb Mon Sep 17 00:00:00 2001
From: Paolo Valente <paolo.valente@unimore.it>
Date: Tue, 3 Sep 2013 16:50:42 +0200
-Subject: [PATCH 1/3] block: cgroups, kconfig, build bits for BFQ-v7r2-3.14
+Subject: [PATCH 1/3] block: cgroups, kconfig, build bits for BFQ-v7r7-3.14
Update Kconfig.iosched and do the related Makefile changes to include
kernel configuration options for BFQ. Also add the bfqio controller
@@ -100,5 +100,5 @@ index 7b99d71..4e8c0ff 100644
SUBSYS(perf)
#endif
--
-1.9.0
+2.1.3
diff --git a/5002_BFQ-2-block-introduce-the-BFQ-v7r2-I-O-sched-for-3.14.patch1 b/5002_BFQ-2-block-introduce-the-BFQ-v7r7-I-O-sched-for-3.14.patch1
similarity index 72%
rename from 5002_BFQ-2-block-introduce-the-BFQ-v7r2-I-O-sched-for-3.14.patch1
rename to 5002_BFQ-2-block-introduce-the-BFQ-v7r7-I-O-sched-for-3.14.patch1
index 133602c..65374b9 100644
--- a/5002_BFQ-2-block-introduce-the-BFQ-v7r2-I-O-sched-for-3.14.patch1
+++ b/5002_BFQ-2-block-introduce-the-BFQ-v7r7-I-O-sched-for-3.14.patch1
@@ -1,9 +1,9 @@
-From 5055277df59d9280da6b60cf90bed8e5e57dc44d Mon Sep 17 00:00:00 2001
+From ba938955e241de008540551e3a16e43850ecdd66 Mon Sep 17 00:00:00 2001
From: Paolo Valente <paolo.valente@unimore.it>
Date: Thu, 9 May 2013 19:10:02 +0200
-Subject: [PATCH 2/3] block: introduce the BFQ-v7r2 I/O sched for 3.14
+Subject: [PATCH 2/3] block: introduce the BFQ-v7r7 I/O sched for 3.14
-Add the BFQ-v7r2 I/O scheduler to 3.14.
+Add the BFQ-v7r7 I/O scheduler to 3.14.
The general structure is borrowed from CFQ, as much of the code for
handling I/O contexts. Over time, several useful features have been
ported from CFQ as well (details in the changelog in README.BFQ). A
@@ -56,12 +56,12 @@ until it expires.
Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
---
- block/bfq-cgroup.c | 926 +++++++++++++++
+ block/bfq-cgroup.c | 938 +++++++++++++
block/bfq-ioc.c | 36 +
- block/bfq-iosched.c | 3300 +++++++++++++++++++++++++++++++++++++++++++++++++++
- block/bfq-sched.c | 1078 +++++++++++++++++
- block/bfq.h | 622 ++++++++++
- 5 files changed, 5962 insertions(+)
+ block/bfq-iosched.c | 3902 +++++++++++++++++++++++++++++++++++++++++++++++++++
+ block/bfq-sched.c | 1214 ++++++++++++++++
+ block/bfq.h | 775 ++++++++++
+ 5 files changed, 6865 insertions(+)
create mode 100644 block/bfq-cgroup.c
create mode 100644 block/bfq-ioc.c
create mode 100644 block/bfq-iosched.c
@@ -70,10 +70,10 @@ Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
new file mode 100644
-index 0000000..bcecdb4
+index 0000000..798bb17
--- /dev/null
+++ b/block/bfq-cgroup.c
-@@ -0,0 +1,926 @@
+@@ -0,0 +1,938 @@
+/*
+ * BFQ: CGROUPS support.
+ *
@@ -85,7 +85,8 @@ index 0000000..bcecdb4
+ *
+ * Copyright (C) 2010 Paolo Valente <paolo.valente@unimore.it>
+ *
-+ * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ file.
++ * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ
++ * file.
+ */
+
+#ifdef CONFIG_CGROUP_BFQIO
@@ -153,6 +154,12 @@ index 0000000..bcecdb4
+ entity->new_weight = bfq_ioprio_to_weight(bgrp->ioprio);
+ entity->new_ioprio = bgrp->ioprio;
+ } else {
++ if (bgrp->weight < BFQ_MIN_WEIGHT ||
++ bgrp->weight > BFQ_MAX_WEIGHT) {
++ printk(KERN_CRIT "bfq_group_init_entity: "
++ "bgrp->weight %d\n", bgrp->weight);
++ BUG();
++ }
+ entity->new_weight = bgrp->weight;
+ entity->new_ioprio = bfq_weight_to_ioprio(bgrp->weight);
+ }
@@ -160,6 +167,7 @@ index 0000000..bcecdb4
+ entity->ioprio = entity->new_ioprio;
+ entity->ioprio_class = entity->new_ioprio_class = bgrp->ioprio_class;
+ entity->my_sched_data = &bfqg->sched_data;
++ bfqg->active_entities = 0;
+}
+
+static inline void bfq_group_set_parent(struct bfq_group *bfqg,
@@ -217,8 +225,9 @@ index 0000000..bcecdb4
+ bfq_group_set_parent(prev, bfqg);
+ /*
+ * Build a list of allocated nodes using the bfqd
-+ * filed, that is still unused and will be initialized
-+ * only after the node will be connected.
++		 * field, which is still unused and will be
++		 * initialized only after the node is
++		 * connected.
+ */
+ prev->bfqd = bfqg;
+ prev = bfqg;
@@ -238,7 +247,8 @@ index 0000000..bcecdb4
+}
+
+/**
-+ * bfq_group_chain_link - link an allocated group chain to a cgroup hierarchy.
++ * bfq_group_chain_link - link an allocated group chain to a cgroup
++ * hierarchy.
+ * @bfqd: the queue descriptor.
+ * @css: the leaf cgroup_subsys_state to start from.
+ * @leaf: the leaf group (to be associated to @cgroup).
@@ -512,7 +522,8 @@ index 0000000..bcecdb4
+}
+
+/**
-+ * bfq_reparent_active_entities - move to the root group all active entities.
++ * bfq_reparent_active_entities - move to the root group all active
++ * entities.
+ * @bfqd: the device data structure with the root group.
+ * @bfqg: the group to move from.
+ * @st: the service tree with the entities.
@@ -557,8 +568,8 @@ index 0000000..bcecdb4
+ hlist_del(&bfqg->group_node);
+
+ /*
-+ * Empty all service_trees belonging to this group before deactivating
-+ * the group itself.
++ * Empty all service_trees belonging to this group before
++ * deactivating the group itself.
+ */
+ for (i = 0; i < BFQ_IOPRIO_CLASSES; i++) {
+ st = bfqg->sched_data.service_tree + i;
@@ -578,7 +589,7 @@ index 0000000..bcecdb4
+ * all the leaf entities corresponding to these queues
+ * to the root_group.
+ * Also, it may happen that the group has an entity
-+ * under service, which is disconnected from the active
++ * in service, which is disconnected from the active
+ * tree: it must be moved, too.
+ * There is no need to put the sync queues, as the
+ * scheduler has taken no reference.
@@ -616,14 +627,14 @@ index 0000000..bcecdb4
+ kfree(bfqg);
+}
+
-+static void bfq_end_raising_async(struct bfq_data *bfqd)
++static void bfq_end_wr_async(struct bfq_data *bfqd)
+{
+ struct hlist_node *tmp;
+ struct bfq_group *bfqg;
+
+ hlist_for_each_entry_safe(bfqg, tmp, &bfqd->group_list, bfqd_node)
-+ bfq_end_raising_async_queues(bfqd, bfqg);
-+ bfq_end_raising_async_queues(bfqd, bfqd->root_group);
++ bfq_end_wr_async_queues(bfqd, bfqg);
++ bfq_end_wr_async_queues(bfqd, bfqd->root_group);
+}
+
+/**
@@ -849,10 +860,11 @@ index 0000000..bcecdb4
+ ioc = task->io_context;
+ if (ioc != NULL && atomic_read(&ioc->nr_tasks) > 1)
+ /*
-+ * ioc == NULL means that the task is either too young
-+ * or exiting: if it has still no ioc the ioc can't be
-+ * shared, if the task is exiting the attach will fail
-+ * anyway, no matter what we return here.
++ * ioc == NULL means that the task is either too
++		 * young or exiting: if it still has no ioc, the
++		 * ioc can't be shared; if the task is exiting, the
++		 * attach will fail anyway, no matter what we
++ * return here.
+ */
+ ret = -EINVAL;
+ task_unlock(task);
@@ -970,9 +982,9 @@ index 0000000..bcecdb4
+{
+}
+
-+static void bfq_end_raising_async(struct bfq_data *bfqd)
++static void bfq_end_wr_async(struct bfq_data *bfqd)
+{
-+ bfq_end_raising_async_queues(bfqd, bfqd->root_group);
++ bfq_end_wr_async_queues(bfqd, bfqd->root_group);
+}
+
+static inline void bfq_disconnect_groups(struct bfq_data *bfqd)
@@ -1044,10 +1056,10 @@ index 0000000..7f6b000
+}
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
new file mode 100644
-index 0000000..f5f71e4
+index 0000000..5aa5f09
--- /dev/null
+++ b/block/bfq-iosched.c
-@@ -0,0 +1,3300 @@
+@@ -0,0 +1,3902 @@
+/*
+ * Budget Fair Queueing (BFQ) disk scheduler.
+ *
@@ -1059,28 +1071,32 @@ index 0000000..f5f71e4
+ *
+ * Copyright (C) 2010 Paolo Valente <paolo.valente@unimore.it>
+ *
-+ * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ file.
++ * Licensed under the GPL-2 as detailed in the accompanying COPYING.BFQ
++ * file.
+ *
-+ * BFQ is a proportional share disk scheduling algorithm based on the
-+ * slice-by-slice service scheme of CFQ. But BFQ assigns budgets, measured in
-+ * number of sectors, to tasks instead of time slices. The disk is not granted
-+ * to the in-service task for a given time slice, but until it has exhausted
-+ * its assigned budget. This change from the time to the service domain allows
-+ * BFQ to distribute the disk bandwidth among tasks as desired, without any
-+ * distortion due to ZBR, workload fluctuations or other factors. BFQ uses an
-+ * ad hoc internal scheduler, called B-WF2Q+, to schedule tasks according to
-+ * their budgets (more precisely BFQ schedules queues associated to tasks).
-+ * Thanks to this accurate scheduler, BFQ can afford to assign high budgets to
-+ * disk-bound non-seeky tasks (to boost the throughput), and yet guarantee low
-+ * latencies to interactive and soft real-time applications.
++ * BFQ is a proportional-share storage-I/O scheduling algorithm based on
++ * the slice-by-slice service scheme of CFQ. But BFQ assigns budgets,
++ * measured in number of sectors, to processes instead of time slices. The
++ * device is not granted to the in-service process for a given time slice,
++ * but until it has exhausted its assigned budget. This change from the time
++ * to the service domain allows BFQ to distribute the device throughput
++ * among processes as desired, without any distortion due to ZBR, workload
++ * fluctuations or other factors. BFQ uses an ad hoc internal scheduler,
++ * called B-WF2Q+, to schedule processes according to their budgets. More
++ * precisely, BFQ schedules queues associated with processes. Thanks to the
++ * accurate policy of B-WF2Q+, BFQ can afford to assign high budgets to
++ * I/O-bound processes issuing sequential requests (to boost the
++ * throughput), and yet guarantee a low latency to interactive and soft
++ * real-time applications.
+ *
+ * BFQ is described in [1], where also a reference to the initial, more
-+ * theoretical paper on BFQ can be found. The interested reader can find in
-+ * the latter paper full details on the main algorithm as well as formulas of
-+ * the guarantees, plus formal proofs of all the properties. With respect to
-+ * the version of BFQ presented in these papers, this implementation adds a
-+ * few more heuristics, such as the one that guarantees a low latency to soft
-+ * real-time applications, and a hierarchical extension based on H-WF2Q+.
++ * theoretical paper on BFQ can be found. The interested reader can find
++ * in the latter paper full details on the main algorithm, as well as
++ * formulas of the guarantees and formal proofs of all the properties.
++ * With respect to the version of BFQ presented in these papers, this
++ * implementation adds a few more heuristics, such as the one that
++ * guarantees a low latency to soft real-time applications, and a
++ * hierarchical extension based on H-WF2Q+.
+ *
+ * B-WF2Q+ is based on WF2Q+, that is described in [2], together with
+ * H-WF2Q+, while the augmented tree used to implement B-WF2Q+ with O(log N)
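
To make the budget-based service scheme described in the comment above concrete, here is a rough userspace sketch (all names and values are hypothetical, not part of the patch): the in-service queue is served until its budget, measured in sectors rather than time, is exhausted.

#include <stdio.h>

struct toy_queue {
	const char *name;
	unsigned long budget;	/* assigned budget, in sectors */
	unsigned long served;	/* sectors served so far */
};

/* Charge one request; return 1 when the budget is exhausted and the
 * queue must expire (the analogue of BFQ expiring the in-service queue). */
static int toy_serve(struct toy_queue *q, unsigned long rq_sectors)
{
	q->served += rq_sectors;
	return q->served >= q->budget;
}

int main(void)
{
	struct toy_queue q = { "seq-reader", 512, 0 };
	unsigned long rq_sectors[] = { 128, 256, 200 };

	for (int i = 0; i < 3; i++)
		if (toy_serve(&q, rq_sectors[i]))
			printf("%s expires after %lu sectors\n",
			       q.name, q.served);
	return 0;
}

Expiration here depends only on the amount of service received, which is why the resulting throughput distribution is immune to time-domain distortions such as ZBR.
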
@@ -1165,21 +1181,46 @@ index 0000000..f5f71e4
+#define BFQ_RATE_SHIFT 16
+
+/*
-+ * The duration of the weight raising for interactive applications is
-+ * computed automatically (as default behaviour), using the following
-+ * formula: duration = (R / r) * T, where r is the peak rate of the
-+ * disk, and R and T are two reference parameters. In particular, R is
-+ * the peak rate of a reference disk, and T is about the maximum time
-+ * for starting popular large applications on that disk, under BFQ and
-+ * while reading two files in parallel. Finally, BFQ uses two
-+ * different pairs (R, T) depending on whether the disk is rotational
-+ * or non-rotational.
++ * By default, BFQ computes the duration of the weight raising for
++ * interactive applications automatically, using the following formula:
++ * duration = (R / r) * T, where r is the peak rate of the device, and
++ * R and T are two reference parameters.
++ * In particular, R is the peak rate of the reference device (see below),
++ * and T is a reference time: given the systems that are likely to be
++ * installed on the reference device according to its speed class, T is
++ * about the maximum time needed, under BFQ and while reading two files in
++ * parallel, to load typical large applications on these systems.
++ * In practice, the slower/faster the device at hand is, the more/less it
++ * takes to load applications with respect to the reference device.
++ * Accordingly, the longer/shorter BFQ grants weight raising to interactive
++ * applications.
++ *
++ * BFQ uses four different reference pairs (R, T), depending on:
++ * . whether the device is rotational or non-rotational;
++ * . whether the device is slow, such as old or portable HDDs, as well as
++ * SD cards, or fast, such as newer HDDs and SSDs.
++ *
++ * The device's speed class is dynamically (re)detected in
++ * bfq_update_peak_rate() every time the estimated peak rate is updated.
++ *
++ * In the following definitions, R_slow[0]/R_fast[0] and T_slow[0]/T_fast[0]
++ * are the reference values for a slow/fast rotational device, whereas
++ * R_slow[1]/R_fast[1] and T_slow[1]/T_fast[1] are the reference values for
++ * a slow/fast non-rotational device. Finally, device_speed_thresh are the
++ * thresholds used to switch between speed classes.
++ * Both the reference peak rates and the thresholds are measured in
++ * sectors/usec, left-shifted by BFQ_RATE_SHIFT.
++ */
++static int R_slow[2] = {1536, 10752};
++static int R_fast[2] = {17415, 34791};
++/*
++ * To improve readability, a conversion function is used to initialize the
++ * following arrays, which entails that they can be initialized only in a
++ * function.
+ */
-+#define T_rot (msecs_to_jiffies(5500))
-+#define T_nonrot (msecs_to_jiffies(2000))
-+/* Next two quantities are in sectors/usec, left-shifted by BFQ_RATE_SHIFT */
-+#define R_rot 17415
-+#define R_nonrot 34791
++static int T_slow[2];
++static int T_fast[2];
++static int device_speed_thresh[2];
+
+#define BFQ_SERVICE_TREE_INIT ((struct bfq_service_tree) \
+ { RB_ROOT, RB_ROOT, NULL, NULL, 0, 0 })
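
As a worked instance of the duration = (R / r) * T rule described above, the following userspace sketch (T and the peak rate below are hypothetical; only R_fast[0] = 17415 comes from the patch) mirrors the RT_prod / peak_rate division performed later by bfq_wr_duration():

#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t R = 17415;		/* R_fast[0] above (fast rotational) */
	uint64_t T = 550;		/* hypothetical reference time, jiffies */
	uint64_t RT_prod = R * T;	/* precomputed, as in bfqd->RT_prod */
	uint64_t r = R / 2;		/* hypothetical: half the reference rate */

	/* duration = (R / r) * T, computed as RT_prod / r for precision */
	printf("wr duration = %llu jiffies\n",
	       (unsigned long long)(RT_prod / r));	/* prints 1100 */
	return 0;
}

A device with half the reference peak rate thus weight-raises interactive queues for twice the reference time, 2T.
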
@@ -1385,6 +1426,125 @@ index 0000000..f5f71e4
+ bfqq->pos_root = NULL;
+}
+
++/*
++ * Tell whether there are active queues or groups with differentiated weights.
++ */
++static inline bool bfq_differentiated_weights(struct bfq_data *bfqd)
++{
++ BUG_ON(!bfqd->hw_tag);
++ /*
++ * For weights to differ, at least one of the trees must contain
++ * at least two nodes.
++ */
++ return (!RB_EMPTY_ROOT(&bfqd->queue_weights_tree) &&
++ (bfqd->queue_weights_tree.rb_node->rb_left ||
++ bfqd->queue_weights_tree.rb_node->rb_right)
++#ifdef CONFIG_CGROUP_BFQIO
++ ) ||
++ (!RB_EMPTY_ROOT(&bfqd->group_weights_tree) &&
++ (bfqd->group_weights_tree.rb_node->rb_left ||
++ bfqd->group_weights_tree.rb_node->rb_right)
++#endif
++ );
++}
++
++/*
++ * If the weight-counter tree passed as input contains no counter for
++ * the weight of the input entity, then add that counter; otherwise just
++ * increment the existing counter.
++ *
++ * Note that weight-counter trees contain few nodes in mostly symmetric
++ * scenarios. For example, if all queues have the same weight, then the
++ * weight-counter tree for the queues may contain at most one node.
++ * This holds even if low_latency is on, because weight-raised queues
++ * are not inserted in the tree.
++ * In most scenarios, the rate at which nodes are created/destroyed
++ * should be low too.
++ */
++static void bfq_weights_tree_add(struct bfq_data *bfqd,
++ struct bfq_entity *entity,
++ struct rb_root *root)
++{
++ struct rb_node **new = &(root->rb_node), *parent = NULL;
++
++ /*
++ * Do not insert if:
++ * - the device does not support queueing;
++ * - the entity is already associated with a counter, which happens if:
++ * 1) the entity is associated with a queue, 2) a request arrival
++ * has caused the queue to become both non-weight-raised, and hence
++ * change its weight, and backlogged; in this respect, each
++ * of the two events causes an invocation of this function,
++ * 3) this is the invocation of this function caused by the second
++ * event. This second invocation is actually useless, and we handle
++ * this fact by exiting immediately. More efficient or clearer
++ * solutions might possibly be adopted.
++ */
++ if (!bfqd->hw_tag || entity->weight_counter)
++ return;
++
++ while (*new) {
++ struct bfq_weight_counter *__counter = container_of(*new,
++ struct bfq_weight_counter,
++ weights_node);
++ parent = *new;
++
++ if (entity->weight == __counter->weight) {
++ entity->weight_counter = __counter;
++ goto inc_counter;
++ }
++ if (entity->weight < __counter->weight)
++ new = &((*new)->rb_left);
++ else
++ new = &((*new)->rb_right);
++ }
++
++ entity->weight_counter = kzalloc(sizeof(struct bfq_weight_counter),
++ GFP_ATOMIC);
++ entity->weight_counter->weight = entity->weight;
++ rb_link_node(&entity->weight_counter->weights_node, parent, new);
++ rb_insert_color(&entity->weight_counter->weights_node, root);
++
++inc_counter:
++ entity->weight_counter->num_active++;
++}
++
++/*
++ * Decrement the weight counter associated with the entity, and, if the
++ * counter reaches 0, remove the counter from the tree.
++ * See the comments to the function bfq_weights_tree_add() for considerations
++ * about overhead.
++ */
++static void bfq_weights_tree_remove(struct bfq_data *bfqd,
++ struct bfq_entity *entity,
++ struct rb_root *root)
++{
++ /*
++ * Check whether the entity is actually associated with a counter.
++ * In fact, the device may not be considered NCQ-capable for a while,
++ * which implies that no insertion in the weight trees is performed,
++ * after which the device may start to be deemed NCQ-capable, and hence
++ * this function may start to be invoked. This may cause the function
++ * to be invoked for entities that are not associated with any counter.
++ */
++ if (!entity->weight_counter)
++ return;
++
++ BUG_ON(RB_EMPTY_ROOT(root));
++ BUG_ON(entity->weight_counter->weight != entity->weight);
++
++ BUG_ON(!entity->weight_counter->num_active);
++ entity->weight_counter->num_active--;
++ if (entity->weight_counter->num_active > 0)
++ goto reset_entity_pointer;
++
++ rb_erase(&entity->weight_counter->weights_node, root);
++ kfree(entity->weight_counter);
++
++reset_entity_pointer:
++ entity->weight_counter = NULL;
++}
++
+static struct request *bfq_find_next_rq(struct bfq_data *bfqd,
+ struct bfq_queue *bfqq,
+ struct request *last)
@@ -1409,37 +1569,12 @@ index 0000000..f5f71e4
+ return bfq_choose_req(bfqd, next, prev, blk_rq_pos(last));
+}
+
-+static void bfq_del_rq_rb(struct request *rq)
-+{
-+ struct bfq_queue *bfqq = RQ_BFQQ(rq);
-+ struct bfq_data *bfqd = bfqq->bfqd;
-+ const int sync = rq_is_sync(rq);
-+
-+ BUG_ON(bfqq->queued[sync] == 0);
-+ bfqq->queued[sync]--;
-+ bfqd->queued--;
-+
-+ elv_rb_del(&bfqq->sort_list, rq);
-+
-+ if (RB_EMPTY_ROOT(&bfqq->sort_list)) {
-+ if (bfq_bfqq_busy(bfqq) && bfqq != bfqd->in_service_queue)
-+ bfq_del_bfqq_busy(bfqd, bfqq, 1);
-+ /*
-+ * Remove queue from request-position tree as it is empty.
-+ */
-+ if (bfqq->pos_root != NULL) {
-+ rb_erase(&bfqq->pos_node, bfqq->pos_root);
-+ bfqq->pos_root = NULL;
-+ }
-+ }
-+}
-+
+/* see the definition of bfq_async_charge_factor for details */
+static inline unsigned long bfq_serv_to_charge(struct request *rq,
+ struct bfq_queue *bfqq)
+{
+ return blk_rq_sectors(rq) *
-+ (1 + ((!bfq_bfqq_sync(bfqq)) * (bfqq->raising_coeff == 1) *
++ (1 + ((!bfq_bfqq_sync(bfqq)) * (bfqq->wr_coeff == 1) *
+ bfq_async_charge_factor));
+}
+
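
The charge formula above penalizes only async queues that are not weight-raised. A minimal sketch of the same expression (the factor value below is hypothetical; the real constant is defined elsewhere in bfq.h):

#include <stdio.h>

#define TOY_ASYNC_CHARGE_FACTOR 10	/* hypothetical value */

/* Mirror of bfq_serv_to_charge(): async, non-weight-raised queues are
 * charged more service than they actually receive. */
static unsigned long toy_serv_to_charge(unsigned long sectors,
					int sync, int wr_coeff)
{
	return sectors * (1 + (!sync) * (wr_coeff == 1) *
			  TOY_ASYNC_CHARGE_FACTOR);
}

int main(void)
{
	printf("%lu\n", toy_serv_to_charge(64, 1, 1));	/* sync: 64 */
	printf("%lu\n", toy_serv_to_charge(64, 0, 1));	/* async: 704 */
	printf("%lu\n", toy_serv_to_charge(64, 0, 2));	/* async, raised: 64 */
	return 0;
}
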
@@ -1477,17 +1612,20 @@ index 0000000..f5f71e4
+
+ new_budget = max_t(unsigned long, bfqq->max_budget,
+ bfq_serv_to_charge(next_rq, bfqq));
-+ entity->budget = new_budget;
-+ bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu", new_budget);
-+ bfq_activate_bfqq(bfqd, bfqq);
++ if (entity->budget != new_budget) {
++ entity->budget = new_budget;
++ bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu",
++ new_budget);
++ bfq_activate_bfqq(bfqd, bfqq);
++ }
+}
+
-+static inline unsigned int bfq_wrais_duration(struct bfq_data *bfqd)
++static inline unsigned int bfq_wr_duration(struct bfq_data *bfqd)
+{
+ u64 dur;
+
-+ if (bfqd->bfq_raising_max_time > 0)
-+ return bfqd->bfq_raising_max_time;
++ if (bfqd->bfq_wr_max_time > 0)
++ return bfqd->bfq_wr_max_time;
+
+ dur = bfqd->RT_prod;
+ do_div(dur, bfqd->peak_rate);
@@ -1495,16 +1633,230 @@ index 0000000..f5f71e4
+ return dur;
+}
+
-+static void bfq_add_rq_rb(struct request *rq)
++/* Empty burst list and add just bfqq (see comments to bfq_handle_burst) */
++static inline void bfq_reset_burst_list(struct bfq_data *bfqd,
++ struct bfq_queue *bfqq)
++{
++ struct bfq_queue *item;
++ struct hlist_node *n;
++
++ hlist_for_each_entry_safe(item, n, &bfqd->burst_list, burst_list_node)
++ hlist_del_init(&item->burst_list_node);
++ hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
++ bfqd->burst_size = 1;
++}
++
++/* Add bfqq to the list of queues in current burst (see bfq_handle_burst) */
++static void bfq_add_to_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq)
++{
++ /* Increment burst size to take into account also bfqq */
++ bfqd->burst_size++;
++
++ if (bfqd->burst_size == bfqd->bfq_large_burst_thresh) {
++ struct bfq_queue *pos, *bfqq_item;
++ struct hlist_node *n;
++
++ /*
++ * Enough queues have been activated shortly after each
++ * other to consider this burst as large.
++ */
++ bfqd->large_burst = true;
++
++ /*
++ * We can now mark all queues in the burst list as
++ * belonging to a large burst.
++ */
++ hlist_for_each_entry(bfqq_item, &bfqd->burst_list,
++ burst_list_node)
++ bfq_mark_bfqq_in_large_burst(bfqq_item);
++ bfq_mark_bfqq_in_large_burst(bfqq);
++
++ /*
++ * From now on, and until the current burst finishes, any
++ * new queue being activated shortly after the last queue
++ * was inserted in the burst can be immediately marked as
++ * belonging to a large burst. So the burst list is not
++ * needed any more. Remove it.
++ */
++ hlist_for_each_entry_safe(pos, n, &bfqd->burst_list,
++ burst_list_node)
++ hlist_del_init(&pos->burst_list_node);
++ } else /* burst not yet large: add bfqq to the burst list */
++ hlist_add_head(&bfqq->burst_list_node, &bfqd->burst_list);
++}
++
++/*
++ * If many queues happen to become active shortly after each other, then,
++ * to help the processes associated with these queues get their job done as
++ * soon as possible, it is usually better not to grant either weight-raising
++ * or device idling to these queues. In this comment we describe, firstly,
++ * the reasons why this fact holds, and, secondly, the next function, which
++ * implements the main steps needed to properly mark these queues so that
++ * they can then be treated in a different way.
++ *
++ * As for the terminology, we say that a queue becomes active, i.e.,
++ * switches from idle to backlogged, either when it is created (as a
++ * consequence of the arrival of an I/O request), or, if already existing,
++ * when a new request for the queue arrives while the queue is idle.
++ * Bursts of activations, i.e., activations of different queues occurring
++ * shortly after each other, are typically caused by services or applications
++ * that spawn or reactivate many parallel threads/processes. Examples are
++ * systemd during boot or git grep.
++ *
++ * These services or applications benefit mostly from a high throughput:
++ * the quicker the requests of the activated queues are cumulatively served,
++ * the sooner the target job of these queues gets completed. As a consequence,
++ * weight-raising any of these queues, which also implies idling the device
++ * for it, is almost always counterproductive: in most cases it just lowers
++ * throughput.
++ *
++ * On the other hand, a burst of activations may also be caused by the start
++ * of an application that does not consist of a lot of parallel I/O-bound
++ * threads. In fact, with a complex application, the burst may be just a
++ * consequence of the fact that several processes need to be executed to
++ * start up the application. To start an application as quickly as possible,
++ * the best thing to do is to privilege the I/O related to the application
++ * with respect to all other I/O. Therefore, the best strategy for starting
++ * an application that causes a burst of activations as quickly as possible
++ * is to weight-raise all the queues activated during the burst. This is the
++ * exact opposite of the best strategy for the other type of bursts.
++ *
++ * In the end, to take the best action for each of the two cases, the two
++ * types of bursts need to be distinguished. Fortunately, this seems
++ * relatively easy to do, by looking at the sizes of the bursts. In
++ * particular, we found a threshold such that bursts with a larger size
++ * than that threshold are apparently caused only by services or commands
++ * such as systemd or git grep. For brevity, hereafter we simply call these
++ * bursts 'large'. BFQ *does not* weight-raise queues whose activations occur
++ * in a large burst. In addition, for each of these queues BFQ performs or
++ * does not perform idling depending on which choice boosts the throughput
++ * most. The exact choice depends on the device and request pattern at
++ * hand.
++ *
++ * Turning back to the next function, it implements all the steps needed
++ * to detect the occurrence of a large burst and to properly mark all the
++ * queues belonging to it (so that they can then be treated in a different
++ * way). This goal is achieved by maintaining a special "burst list" that
++ * holds, temporarily, the queues that belong to the burst in progress. The
++ * list is then used to mark these queues as belonging to a large burst if
++ * the burst does become large. The main steps are the following.
++ *
++ * . when the very first queue is activated, the queue is inserted into the
++ * list (as it could be the first queue in a possible burst)
++ *
++ * . if the current burst has not yet become large, and a queue Q that does
++ * not yet belong to the burst is activated shortly after the last time
++ * at which a new queue entered the burst list, then the function appends
++ * Q to the burst list
++ *
++ * . if, as a consequence of the previous step, the burst size reaches
++ * the large-burst threshold, then
++ *
++ * . all the queues in the burst list are marked as belonging to a
++ * large burst
++ *
++ * . the burst list is deleted; in fact, the burst list already served
++ * its purpose (keeping temporarily track of the queues in a burst,
++ * so as to be able to mark them as belonging to a large burst in the
++ * previous sub-step), and now is not needed any more
++ *
++ * . the device enters a large-burst mode
++ *
++ * . if a queue Q that does not belong to the burst is activated while
++ * the device is in large-burst mode and shortly after the last time
++ * at which a queue either entered the burst list or was marked as
++ * belonging to the current large burst, then Q is immediately marked
++ * as belonging to a large burst.
++ *
++ * . if a queue Q that does not belong to the burst is activated a while
++ * later, i.e., not shortly after the last time at which a queue
++ * either entered the burst list or was marked as belonging to the
++ * current large burst, then the current burst is deemed as finished and:
++ *
++ * . the large-burst mode is reset if set
++ *
++ * . the burst list is emptied
++ *
++ * . Q is inserted in the burst list, as Q may be the first queue
++ * in a possible new burst (then the burst list contains just Q
++ * after this step).
++ */
++static void bfq_handle_burst(struct bfq_data *bfqd, struct bfq_queue *bfqq,
++ bool idle_for_long_time)
++{
++ /*
++ * If bfqq happened to be activated in a burst, but has been idle
++ * for at least as long as an interactive queue, then we assume
++ * that, in the overall I/O initiated in the burst, the I/O
++	 * associated with bfqq is finished. So bfqq does not need to be
++	 * treated as a queue belonging to a burst anymore. Accordingly,
++	 * we reset bfqq's in_large_burst flag if set, and remove bfqq
++	 * from the burst list if it's there. We do not, however, decrement
++	 * burst_size, because the fact that bfqq does not need to belong
++ * to the burst list any more does not invalidate the fact that
++ * bfqq may have been activated during the current burst.
++ */
++ if (idle_for_long_time) {
++ hlist_del_init(&bfqq->burst_list_node);
++ bfq_clear_bfqq_in_large_burst(bfqq);
++ }
++
++ /*
++ * If bfqq is already in the burst list or is part of a large
++ * burst, then there is nothing else to do.
++ */
++ if (!hlist_unhashed(&bfqq->burst_list_node) ||
++ bfq_bfqq_in_large_burst(bfqq))
++ return;
++
++ /*
++ * If bfqq's activation happens late enough, then the current
++ * burst is finished, and related data structures must be reset.
++ *
++ * In this respect, consider the special case where bfqq is the very
++ * first queue being activated. In this case, last_ins_in_burst is
++ * not yet significant when we get here. But it is easy to verify
++ * that, whether or not the following condition is true, bfqq will
++ * end up being inserted into the burst list. In particular the
++ * list will happen to contain only bfqq. And this is exactly what
++ * has to happen, as bfqq may be the first queue in a possible
++ * burst.
++ */
++ if (time_is_before_jiffies(bfqd->last_ins_in_burst +
++ bfqd->bfq_burst_interval)) {
++ bfqd->large_burst = false;
++ bfq_reset_burst_list(bfqd, bfqq);
++ return;
++ }
++
++ /*
++ * If we get here, then bfqq is being activated shortly after the
++ * last queue. So, if the current burst is also large, we can mark
++ * bfqq as belonging to this large burst immediately.
++ */
++ if (bfqd->large_burst) {
++ bfq_mark_bfqq_in_large_burst(bfqq);
++ return;
++ }
++
++ /*
++ * If we get here, then a large-burst state has not yet been
++ * reached, but bfqq is being activated shortly after the last
++ * queue. Then we add bfqq to the burst.
++ */
++ bfq_add_to_burst(bfqd, bfqq);
++}
++
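
The detection logic implemented by bfq_handle_burst() above compresses to a few lines. Here is a toy userspace trace of it (the interval and threshold values are hypothetical; the kernel code uses jiffies and per-device tunables):

#include <stdio.h>

#define TOY_BURST_INTERVAL 100		/* hypothetical, abstract ticks */
#define TOY_LARGE_BURST_THRESH 3	/* hypothetical */

int main(void)
{
	/* Queue-activation times: three close together, then one isolated. */
	long act[] = { 0, 10, 20, 500 };
	long last = -TOY_BURST_INTERVAL - 1;
	int burst_size = 0, large = 0;

	for (int i = 0; i < 4; i++) {
		if (act[i] - last > TOY_BURST_INTERVAL) {
			burst_size = 1;	/* late arrival: burst is over */
			large = 0;
		} else if (++burst_size >= TOY_LARGE_BURST_THRESH)
			large = 1;	/* burst became large */
		last = act[i];
		printf("t=%-4ld burst_size=%d large=%d\n",
		       act[i], burst_size, large);
	}
	return 0;
}

The first three activations fall within the interval and push the burst size to the threshold, marking the burst large; the fourth arrives late and starts a new, singleton burst.
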
++static void bfq_add_request(struct request *rq)
+{
+ struct bfq_queue *bfqq = RQ_BFQQ(rq);
+ struct bfq_entity *entity = &bfqq->entity;
+ struct bfq_data *bfqd = bfqq->bfqd;
+ struct request *next_rq, *prev;
-+ unsigned long old_raising_coeff = bfqq->raising_coeff;
-+ int idle_for_long_time = 0;
++ unsigned long old_wr_coeff = bfqq->wr_coeff;
++ bool interactive = false;
+
-+ bfq_log_bfqq(bfqd, bfqq, "add_rq_rb %d", rq_is_sync(rq));
++ bfq_log_bfqq(bfqd, bfqq, "add_request %d", rq_is_sync(rq));
+ bfqq->queued[rq_is_sync(rq)]++;
+ bfqd->queued++;
+
@@ -1525,14 +1877,50 @@ index 0000000..f5f71e4
+ bfq_rq_pos_tree_add(bfqd, bfqq);
+
+ if (!bfq_bfqq_busy(bfqq)) {
-+ int soft_rt = bfqd->bfq_raising_max_softrt_rate > 0 &&
++ bool soft_rt,
++ idle_for_long_time = time_is_before_jiffies(
++ bfqq->budget_timeout +
++ bfqd->bfq_wr_min_idle_time);
++
++ if (bfq_bfqq_sync(bfqq)) {
++ bool already_in_burst =
++ !hlist_unhashed(&bfqq->burst_list_node) ||
++ bfq_bfqq_in_large_burst(bfqq);
++ bfq_handle_burst(bfqd, bfqq, idle_for_long_time);
++ /*
++ * If bfqq was not already in the current burst,
++ * then, at this point, bfqq either has been
++ * added to the current burst or has caused the
++ * current burst to terminate. In particular, in
++ * the second case, bfqq has become the first
++ * queue in a possible new burst.
++ * In both cases last_ins_in_burst needs to be
++ * moved forward.
++ */
++ if (!already_in_burst)
++ bfqd->last_ins_in_burst = jiffies;
++ }
++
++ soft_rt = bfqd->bfq_wr_max_softrt_rate > 0 &&
++ !bfq_bfqq_in_large_burst(bfqq) &&
+ time_is_before_jiffies(bfqq->soft_rt_next_start);
-+ idle_for_long_time = time_is_before_jiffies(
-+ bfqq->budget_timeout +
-+ bfqd->bfq_raising_min_idle_time);
++ interactive = !bfq_bfqq_in_large_burst(bfqq) &&
++ idle_for_long_time;
+ entity->budget = max_t(unsigned long, bfqq->max_budget,
+ bfq_serv_to_charge(next_rq, bfqq));
+
++ if (!bfq_bfqq_IO_bound(bfqq)) {
++ if (time_before(jiffies,
++ RQ_BIC(rq)->ttime.last_end_request +
++ bfqd->bfq_slice_idle)) {
++ bfqq->requests_within_timer++;
++ if (bfqq->requests_within_timer >=
++ bfqd->bfq_requests_within_timer)
++ bfq_mark_bfqq_IO_bound(bfqq);
++ } else
++ bfqq->requests_within_timer = 0;
++ }
++
+ if (!bfqd->low_latency)
+ goto add_bfqq_busy;
+
@@ -1540,45 +1928,40 @@ index 0000000..f5f71e4
+ * If the queue is not being boosted and has been idle
+ * for enough time, start a weight-raising period
+ */
-+ if (old_raising_coeff == 1 &&
-+ (idle_for_long_time || soft_rt)) {
-+ bfqq->raising_coeff = bfqd->bfq_raising_coeff;
-+ if (idle_for_long_time)
-+ bfqq->raising_cur_max_time =
-+ bfq_wrais_duration(bfqd);
++ if (old_wr_coeff == 1 && (interactive || soft_rt)) {
++ bfqq->wr_coeff = bfqd->bfq_wr_coeff;
++ if (interactive)
++ bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
+ else
-+ bfqq->raising_cur_max_time =
-+ bfqd->bfq_raising_rt_max_time;
++ bfqq->wr_cur_max_time =
++ bfqd->bfq_wr_rt_max_time;
+ bfq_log_bfqq(bfqd, bfqq,
-+ "wrais starting at %lu, "
-+ "rais_max_time %u",
++ "wrais starting at %lu, rais_max_time %u",
+ jiffies,
-+ jiffies_to_msecs(bfqq->
-+ raising_cur_max_time));
-+ } else if (old_raising_coeff > 1) {
-+ if (idle_for_long_time)
-+ bfqq->raising_cur_max_time =
-+ bfq_wrais_duration(bfqd);
-+ else if (bfqq->raising_cur_max_time ==
-+ bfqd->bfq_raising_rt_max_time &&
-+ !soft_rt) {
-+ bfqq->raising_coeff = 1;
++ jiffies_to_msecs(bfqq->wr_cur_max_time));
++ } else if (old_wr_coeff > 1) {
++ if (interactive)
++ bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
++ else if (bfq_bfqq_in_large_burst(bfqq) ||
++ (bfqq->wr_cur_max_time ==
++ bfqd->bfq_wr_rt_max_time &&
++ !soft_rt)) {
++ bfqq->wr_coeff = 1;
+ bfq_log_bfqq(bfqd, bfqq,
-+ "wrais ending at %lu, "
-+ "rais_max_time %u",
-+ jiffies,
-+ jiffies_to_msecs(bfqq->
-+ raising_cur_max_time));
++ "wrais ending at %lu, rais_max_time %u",
++ jiffies,
++ jiffies_to_msecs(bfqq->
++ wr_cur_max_time));
+ } else if (time_before(
-+ bfqq->last_rais_start_finish +
-+ bfqq->raising_cur_max_time,
++ bfqq->last_wr_start_finish +
++ bfqq->wr_cur_max_time,
+ jiffies +
-+ bfqd->bfq_raising_rt_max_time) &&
++ bfqd->bfq_wr_rt_max_time) &&
+ soft_rt) {
+ /*
+ *
+ * The remaining weight-raising time is lower
-+ * than bfqd->bfq_raising_rt_max_time, which
++ * than bfqd->bfq_wr_rt_max_time, which
+ * means that the application is enjoying
+ * weight raising either because deemed soft-
+ * rt in the near past, or because deemed
@@ -1619,12 +2002,12 @@ index 0000000..f5f71e4
+ * latency because the application is not
+ * weight-raised while they are pending.
+ */
-+ bfqq->last_rais_start_finish = jiffies;
-+ bfqq->raising_cur_max_time =
-+ bfqd->bfq_raising_rt_max_time;
++ bfqq->last_wr_start_finish = jiffies;
++ bfqq->wr_cur_max_time =
++ bfqd->bfq_wr_rt_max_time;
+ }
+ }
-+ if (old_raising_coeff != bfqq->raising_coeff)
++ if (old_wr_coeff != bfqq->wr_coeff)
+ entity->ioprio_changed = 1;
+add_bfqq_busy:
+ bfqq->last_idle_bklogged = jiffies;
@@ -1632,38 +2015,27 @@ index 0000000..f5f71e4
+ bfq_clear_bfqq_softrt_update(bfqq);
+ bfq_add_bfqq_busy(bfqd, bfqq);
+ } else {
-+ if (bfqd->low_latency && old_raising_coeff == 1 &&
-+ !rq_is_sync(rq) &&
-+ time_is_before_jiffies(
-+ bfqq->last_rais_start_finish +
-+ bfqd->bfq_raising_min_inter_arr_async)) {
-+ bfqq->raising_coeff = bfqd->bfq_raising_coeff;
-+ bfqq->raising_cur_max_time = bfq_wrais_duration(bfqd);
-+
-+ bfqd->raised_busy_queues++;
++ if (bfqd->low_latency && old_wr_coeff == 1 && !rq_is_sync(rq) &&
++ time_is_before_jiffies(
++ bfqq->last_wr_start_finish +
++ bfqd->bfq_wr_min_inter_arr_async)) {
++ bfqq->wr_coeff = bfqd->bfq_wr_coeff;
++ bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
++
++ bfqd->wr_busy_queues++;
+ entity->ioprio_changed = 1;
+ bfq_log_bfqq(bfqd, bfqq,
-+ "non-idle wrais starting at %lu, "
-+ "rais_max_time %u",
-+ jiffies,
-+ jiffies_to_msecs(bfqq->
-+ raising_cur_max_time));
++ "non-idle wrais starting at %lu, rais_max_time %u",
++ jiffies,
++ jiffies_to_msecs(bfqq->wr_cur_max_time));
+ }
-+ bfq_updated_next_req(bfqd, bfqq);
++ if (prev != bfqq->next_rq)
++ bfq_updated_next_req(bfqd, bfqq);
+ }
+
+ if (bfqd->low_latency &&
-+ (old_raising_coeff == 1 || bfqq->raising_coeff == 1 ||
-+ idle_for_long_time))
-+ bfqq->last_rais_start_finish = jiffies;
-+}
-+
-+static void bfq_reposition_rq_rb(struct bfq_queue *bfqq, struct request *rq)
-+{
-+ elv_rb_del(&bfqq->sort_list, rq);
-+ bfqq->queued[rq_is_sync(rq)]--;
-+ bfqq->bfqd->queued--;
-+ bfq_add_rq_rb(rq);
++ (old_wr_coeff == 1 || bfqq->wr_coeff == 1 || interactive))
++ bfqq->last_wr_start_finish = jiffies;
+}
+
+static struct request *bfq_find_rq_fmerge(struct bfq_data *bfqd,
@@ -1694,11 +2066,12 @@ index 0000000..f5f71e4
+ (long long unsigned)bfqd->last_position);
+}
+
-+static void bfq_deactivate_request(struct request_queue *q, struct request *rq)
++static inline void bfq_deactivate_request(struct request_queue *q,
++ struct request *rq)
+{
+ struct bfq_data *bfqd = q->elevator->elevator_data;
+
-+ WARN_ON(bfqd->rq_in_driver == 0);
++ BUG_ON(bfqd->rq_in_driver == 0);
+ bfqd->rq_in_driver--;
+}
+
@@ -1706,6 +2079,7 @@ index 0000000..f5f71e4
+{
+ struct bfq_queue *bfqq = RQ_BFQQ(rq);
+ struct bfq_data *bfqd = bfqq->bfqd;
++ const int sync = rq_is_sync(rq);
+
+ if (bfqq->next_rq == rq) {
+ bfqq->next_rq = bfq_find_next_rq(bfqd, bfqq, rq);
@@ -1713,10 +2087,25 @@ index 0000000..f5f71e4
+ }
+
+ list_del_init(&rq->queuelist);
-+ bfq_del_rq_rb(rq);
++ BUG_ON(bfqq->queued[sync] == 0);
++ bfqq->queued[sync]--;
++ bfqd->queued--;
++ elv_rb_del(&bfqq->sort_list, rq);
++
++ if (RB_EMPTY_ROOT(&bfqq->sort_list)) {
++ if (bfq_bfqq_busy(bfqq) && bfqq != bfqd->in_service_queue)
++ bfq_del_bfqq_busy(bfqd, bfqq, 1);
++ /*
++ * Remove queue from request-position tree as it is empty.
++ */
++ if (bfqq->pos_root != NULL) {
++ rb_erase(&bfqq->pos_node, bfqq->pos_root);
++ bfqq->pos_root = NULL;
++ }
++ }
+
+ if (rq->cmd_flags & REQ_META) {
-+ WARN_ON(bfqq->meta_pending == 0);
++ BUG_ON(bfqq->meta_pending == 0);
+ bfqq->meta_pending--;
+ }
+}
@@ -1739,10 +2128,33 @@ index 0000000..f5f71e4
+static void bfq_merged_request(struct request_queue *q, struct request *req,
+ int type)
+{
-+ if (type == ELEVATOR_FRONT_MERGE) {
++ if (type == ELEVATOR_FRONT_MERGE &&
++ rb_prev(&req->rb_node) &&
++ blk_rq_pos(req) <
++ blk_rq_pos(container_of(rb_prev(&req->rb_node),
++ struct request, rb_node))) {
+ struct bfq_queue *bfqq = RQ_BFQQ(req);
-+
-+ bfq_reposition_rq_rb(bfqq, req);
++ struct bfq_data *bfqd = bfqq->bfqd;
++ struct request *prev, *next_rq;
++
++ /* Reposition request in its sort_list */
++ elv_rb_del(&bfqq->sort_list, req);
++ elv_rb_add(&bfqq->sort_list, req);
++ /* Choose next request to be served for bfqq */
++ prev = bfqq->next_rq;
++ next_rq = bfq_choose_req(bfqd, bfqq->next_rq, req,
++ bfqd->last_position);
++ BUG_ON(next_rq == NULL);
++ bfqq->next_rq = next_rq;
++ /*
++ * If next_rq changes, update both the queue's budget to
++ * fit the new request and the queue's position in its
++ * rq_pos_tree.
++ */
++ if (prev != bfqq->next_rq) {
++ bfq_updated_next_req(bfqd, bfqq);
++ bfq_rq_pos_tree_add(bfqd, bfqq);
++ }
+ }
+}
+
@@ -1767,41 +2179,41 @@ index 0000000..f5f71e4
+}
+
+/* Must be called with bfqq != NULL */
-+static inline void bfq_bfqq_end_raising(struct bfq_queue *bfqq)
++static inline void bfq_bfqq_end_wr(struct bfq_queue *bfqq)
+{
+ BUG_ON(bfqq == NULL);
+ if (bfq_bfqq_busy(bfqq))
-+ bfqq->bfqd->raised_busy_queues--;
-+ bfqq->raising_coeff = 1;
-+ bfqq->raising_cur_max_time = 0;
++ bfqq->bfqd->wr_busy_queues--;
++ bfqq->wr_coeff = 1;
++ bfqq->wr_cur_max_time = 0;
+ /* Trigger a weight change on the next activation of the queue */
+ bfqq->entity.ioprio_changed = 1;
+}
+
-+static void bfq_end_raising_async_queues(struct bfq_data *bfqd,
-+ struct bfq_group *bfqg)
++static void bfq_end_wr_async_queues(struct bfq_data *bfqd,
++ struct bfq_group *bfqg)
+{
+ int i, j;
+
+ for (i = 0; i < 2; i++)
+ for (j = 0; j < IOPRIO_BE_NR; j++)
+ if (bfqg->async_bfqq[i][j] != NULL)
-+ bfq_bfqq_end_raising(bfqg->async_bfqq[i][j]);
++ bfq_bfqq_end_wr(bfqg->async_bfqq[i][j]);
+ if (bfqg->async_idle_bfqq != NULL)
-+ bfq_bfqq_end_raising(bfqg->async_idle_bfqq);
++ bfq_bfqq_end_wr(bfqg->async_idle_bfqq);
+}
+
-+static void bfq_end_raising(struct bfq_data *bfqd)
++static void bfq_end_wr(struct bfq_data *bfqd)
+{
+ struct bfq_queue *bfqq;
+
+ spin_lock_irq(bfqd->queue->queue_lock);
+
+ list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list)
-+ bfq_bfqq_end_raising(bfqq);
++ bfq_bfqq_end_wr(bfqq);
+ list_for_each_entry(bfqq, &bfqd->idle_list, bfqq_list)
-+ bfq_bfqq_end_raising(bfqq);
-+ bfq_end_raising_async(bfqd);
++ bfq_bfqq_end_wr(bfqq);
++ bfq_end_wr_async(bfqd);
+
+ spin_unlock_irq(bfqd->queue->queue_lock);
+}
@@ -1904,8 +2316,8 @@ index 0000000..f5f71e4
+
+ /*
+ * If the exact sector wasn't found, the parent of the NULL leaf
-+ * will contain the closest sector (rq_pos_tree sorted by next_request
-+ * position).
++ * will contain the closest sector (rq_pos_tree sorted by
++ * next_request position).
+ */
+ __bfqq = rb_entry(parent, struct bfq_queue, pos_node);
+ if (bfq_rq_close(bfqd, __bfqq->next_rq))
@@ -2007,35 +2419,15 @@ index 0000000..f5f71e4
+ return bfqd->bfq_max_budget / 32;
+}
+
-+/*
-+ * Decides whether idling should be done for given device and
-+ * given in-service queue.
-+ */
-+static inline bool bfq_queue_nonrot_noidle(struct bfq_data *bfqd,
-+ struct bfq_queue *in_service_bfqq)
-+{
-+ if (in_service_bfqq == NULL)
-+ return false;
-+ /*
-+ * If the device is non-rotational, and hence has no seek penalty,
-+ * disable idling; but do so only if:
-+ * - device does not support queuing, otherwise we still have
-+ * a problem with sync vs async workloads;
-+ * - the queue is not weight-raised, to preserve guarantees.
-+ */
-+ return blk_queue_nonrot(bfqd->queue) && bfqd->hw_tag &&
-+ (in_service_bfqq->raising_coeff == 1);
-+}
-+
+static void bfq_arm_slice_timer(struct bfq_data *bfqd)
+{
+ struct bfq_queue *bfqq = bfqd->in_service_queue;
+ struct bfq_io_cq *bic;
+ unsigned long sl;
+
-+ WARN_ON(!RB_EMPTY_ROOT(&bfqq->sort_list));
++ BUG_ON(!RB_EMPTY_ROOT(&bfqq->sort_list));
+
-+ /* Tasks have exited, don't wait. */
++ /* Processes have exited, don't wait. */
+ bic = bfqd->in_service_bic;
+ if (bic == NULL || atomic_read(&bic->icq.ioc->active_ref) == 0)
+ return;
@@ -2053,11 +2445,17 @@ index 0000000..f5f71e4
+ * BFQ_MIN_TT. This happened to help reduce latency.
+ */
+ sl = bfqd->bfq_slice_idle;
-+ if (bfq_sample_valid(bfqq->seek_samples) && BFQQ_SEEKY(bfqq) &&
-+ bfqq->entity.service > bfq_max_budget(bfqd) / 8 &&
-+ bfqq->raising_coeff == 1)
++ /*
++ * Unless the queue is being weight-raised, grant only minimum idle
++ * time if the queue either has been seeky for long enough or has
++ * already proved to be constantly seeky.
++ */
++ if (bfq_sample_valid(bfqq->seek_samples) &&
++ ((BFQQ_SEEKY(bfqq) && bfqq->entity.service >
++ bfq_max_budget(bfqq->bfqd) / 8) ||
++ bfq_bfqq_constantly_seeky(bfqq)) && bfqq->wr_coeff == 1)
+ sl = min(sl, msecs_to_jiffies(BFQ_MIN_TT));
-+ else if (bfqq->raising_coeff > 1)
++ else if (bfqq->wr_coeff > 1)
+ sl = sl * 3;
+ bfqd->last_idling_start = ktime_get();
+ mod_timer(&bfqd->idle_slice_timer, jiffies + sl);
@@ -2074,7 +2472,7 @@ index 0000000..f5f71e4
+{
+ struct bfq_queue *bfqq = bfqd->in_service_queue;
+ unsigned int timeout_coeff;
-+ if (bfqq->raising_cur_max_time == bfqd->bfq_raising_rt_max_time)
++ if (bfqq->wr_cur_max_time == bfqd->bfq_wr_rt_max_time)
+ timeout_coeff = 1;
+ else
+ timeout_coeff = bfqq->entity.weight / bfqq->entity.orig_weight;
@@ -2098,8 +2496,18 @@ index 0000000..f5f71e4
+ struct bfq_data *bfqd = q->elevator->elevator_data;
+ struct bfq_queue *bfqq = RQ_BFQQ(rq);
+
-+ bfq_remove_request(rq);
++ /*
++ * For consistency, the next instruction should have been executed
++ * after removing the request from the queue and dispatching it.
++ * We execute instead this instruction before bfq_remove_request()
++ * (and hence introduce a temporary inconsistency), for efficiency.
++ * In fact, in a forced_dispatch, this prevents two counters related
++	 * to bfqq->dispatched from being uselessly decremented if bfqq
++	 * is not in service, and then incremented again after
++ * incrementing bfqq->dispatched.
++ */
+ bfqq->dispatched++;
++ bfq_remove_request(rq);
+ elv_dispatch_sort(q, rq);
+
+ if (bfq_bfqq_sync(bfqq))
@@ -2129,9 +2537,7 @@ index 0000000..f5f71e4
+ return rq;
+}
+
-+/*
-+ * Must be called with the queue_lock held.
-+ */
++/* Must be called with the queue_lock held. */
+static int bfqq_process_refs(struct bfq_queue *bfqq)
+{
+ int process_refs, io_refs;
@@ -2209,9 +2615,9 @@ index 0000000..f5f71e4
+
+ if (RB_EMPTY_ROOT(&bfqq->sort_list)) {
+ /*
-+ * overloading budget_timeout field to store when
-+ * the queue remains with no backlog, used by
-+ * the weight-raising mechanism
++ * Overloading budget_timeout field to store the time
++ * at which the queue remains with no backlog; used by
++ * the weight-raising mechanism.
+ */
+ bfqq->budget_timeout = jiffies;
+ bfq_del_bfqq_busy(bfqd, bfqq, 1);
@@ -2439,11 +2845,26 @@ index 0000000..f5f71e4
+ bfqd->peak_rate_samples++;
+
+ if (bfqd->peak_rate_samples == BFQ_PEAK_RATE_SAMPLES &&
-+ update && bfqd->bfq_user_max_budget == 0) {
-+ bfqd->bfq_max_budget =
-+ bfq_calc_max_budget(bfqd->peak_rate, timeout);
-+ bfq_log(bfqd, "new max_budget=%lu",
-+ bfqd->bfq_max_budget);
++ update) {
++ int dev_type = blk_queue_nonrot(bfqd->queue);
++ if (bfqd->bfq_user_max_budget == 0) {
++ bfqd->bfq_max_budget =
++ bfq_calc_max_budget(bfqd->peak_rate,
++ timeout);
++ bfq_log(bfqd, "new max_budget=%lu",
++ bfqd->bfq_max_budget);
++ }
++ if (bfqd->device_speed == BFQ_BFQD_FAST &&
++ bfqd->peak_rate < device_speed_thresh[dev_type]) {
++ bfqd->device_speed = BFQ_BFQD_SLOW;
++ bfqd->RT_prod = R_slow[dev_type] *
++ T_slow[dev_type];
++ } else if (bfqd->device_speed == BFQ_BFQD_SLOW &&
++ bfqd->peak_rate > device_speed_thresh[dev_type]) {
++ bfqd->device_speed = BFQ_BFQD_FAST;
++ bfqd->RT_prod = R_fast[dev_type] *
++ T_fast[dev_type];
++ }
+ }
+ }
+
@@ -2479,10 +2900,10 @@ index 0000000..f5f71e4
+}
+
+/*
-+ * To be deemed as soft real-time, an application must meet two requirements.
-+ * First, the application must not require an average bandwidth higher than
-+ * the approximate bandwidth required to playback or record a compressed high-
-+ * definition video.
++ * To be deemed as soft real-time, an application must meet two
++ * requirements. First, the application must not require an average
++ * bandwidth higher than the approximate bandwidth required to play back or
++ * record a compressed high-definition video.
+ * The next function is invoked on the completion of the last request of a
+ * batch, to compute the next-start time instant, soft_rt_next_start, such
+ * that, if the next request of the application does not arrive before
@@ -2493,30 +2914,31 @@ index 0000000..f5f71e4
+ * the application stops issuing new requests until all its pending requests
+ * have been completed. After that, the application may issue a new batch,
+ * and so on.
-+ * For this reason the next function is invoked to compute soft_rt_next_start
-+ * only for applications that meet this requirement, whereas soft_rt_next_start
-+ * is set to infinity for applications that do not.
++ * For this reason the next function is invoked to compute
++ * soft_rt_next_start only for applications that meet this requirement,
++ * whereas soft_rt_next_start is set to infinity for applications that do
++ * not.
+ *
+ * Unfortunately, even a greedy application may happen to behave in an
-+ * isochronous way if the CPU load is high. In fact, the application may stop
-+ * issuing requests while the CPUs are busy serving other processes, then
-+ * restart, then stop again for a while, and so on. In addition, if the disk
-+ * achieves a low enough throughput with the request pattern issued by the
-+ * application (e.g., because the request pattern is random and/or the device
-+ * is slow), then the application may meet the above bandwidth requirement too.
-+ * To prevent such a greedy application to be deemed as soft real-time, a
-+ * further rule is used in the computation of soft_rt_next_start:
-+ * soft_rt_next_start must be higher than the current time plus the maximum
-+ * time for which the arrival of a request is waited for when a sync queue
-+ * becomes idle, namely bfqd->bfq_slice_idle.
-+ * This filters out greedy applications, as the latter issue instead their next
-+ * request as soon as possible after the last one has been completed (in
-+ * contrast, when a batch of requests is completed, a soft real-time application
-+ * spends some time processing data).
++ * isochronous way if the CPU load is high. In fact, the application may
++ * stop issuing requests while the CPUs are busy serving other processes,
++ * then restart, then stop again for a while, and so on. In addition, if
++ * the disk achieves a low enough throughput with the request pattern
++ * issued by the application (e.g., because the request pattern is random
++ * and/or the device is slow), then the application may meet the above
++ * bandwidth requirement too. To prevent such a greedy application from being
++ * deemed soft real-time, a further rule is used in the computation of
++ * soft_rt_next_start: soft_rt_next_start must be higher than the current
++ * time plus the maximum time for which the arrival of a request is waited
++ * for when a sync queue becomes idle, namely bfqd->bfq_slice_idle.
++ * This filters out greedy applications, as the latter instead issue their
++ * next request as soon as possible after the last one has been completed
++ * (in contrast, when a batch of requests is completed, a soft real-time
++ * application spends some time processing data).
+ *
-+ * Unfortunately, the last filter may easily generate false positives if only
-+ * bfqd->bfq_slice_idle is used as a reference time interval and one or both
-+ * the following cases occur:
++ * Unfortunately, the last filter may easily generate false positives if
++ * only bfqd->bfq_slice_idle is used as a reference time interval and one
++ * or both of the following cases occur:
+ * 1) HZ is so low that the duration of a jiffy is comparable to or higher
+ * than bfqd->bfq_slice_idle. This happens, e.g., on slow devices with
+ * HZ=100.
@@ -2525,15 +2947,16 @@ index 0000000..f5f71e4
+ * increments. This seems to happen, e.g., inside virtual machines.
+ * To address this issue, we do not use as a reference time interval just
+ * bfqd->bfq_slice_idle, but bfqd->bfq_slice_idle plus a few jiffies. In
-+ * particular we add the minimum number of jiffies for which the filter seems
-+ * to be quite precise also in embedded systems and KVM/QEMU virtual machines.
++ * particular we add the minimum number of jiffies for which the filter
++ * seems to be quite precise also in embedded systems and KVM/QEMU virtual
++ * machines.
+ */
+static inline unsigned long bfq_bfqq_softrt_next_start(struct bfq_data *bfqd,
+ struct bfq_queue *bfqq)
+{
+ return max(bfqq->last_idle_bklogged +
+ HZ * bfqq->service_from_backlogged /
-+ bfqd->bfq_raising_max_softrt_rate,
++ bfqd->bfq_wr_max_softrt_rate,
+ jiffies + bfqq->bfqd->bfq_slice_idle + 4);
+}
+
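
Plugging hypothetical numbers into the formula above shows how its two bounds interact (all values below are invented; HZ and the tunables vary per system):

#include <stdio.h>

#define TOY_HZ 100	/* hypothetical tick rate */

int main(void)
{
	unsigned long jiffies = 10000;			/* now */
	unsigned long last_idle_bklogged = 9800;
	unsigned long service_from_backlogged = 4096;	/* sectors */
	unsigned long wr_max_softrt_rate = 7000;	/* sectors/sec */
	unsigned long slice_idle = 8;			/* jiffies */

	unsigned long bw_bound = last_idle_bklogged +
		(unsigned long)TOY_HZ * service_from_backlogged /
		wr_max_softrt_rate;			/* 9800 + 58 = 9858 */
	unsigned long greedy_bound = jiffies + slice_idle + 4;	/* 10012 */

	/* soft_rt_next_start is the later of the two bounds */
	printf("soft_rt_next_start = %lu\n",
	       bw_bound > greedy_bound ? bw_bound : greedy_bound);
	return 0;
}

Here the greedy-application filter dominates: even though the bandwidth-based bound (9858) has already passed, the queue cannot be deemed soft real-time before jiffies + slice_idle + 4.
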
@@ -2594,7 +3017,7 @@ index 0000000..f5f71e4
+ * As above explained, 'punish' slow (i.e., seeky), timed-out
+ * and async queues, to favor sequential sync workloads.
+ *
-+ * Processes doing IO in the slower disk zones will tend to be
++ * Processes doing I/O in the slower disk zones will tend to be
+ * slow(er) even if not seeky. Hence, since the estimated peak
+ * rate is actually an average over the disk surface, these
+ * processes may timeout just for bad luck. To avoid punishing
@@ -2607,19 +3030,31 @@ index 0000000..f5f71e4
+
+ bfqq->service_from_backlogged += bfqq->entity.service;
+
-+ if (bfqd->low_latency && bfqq->raising_coeff == 1)
-+ bfqq->last_rais_start_finish = jiffies;
++ if (BFQQ_SEEKY(bfqq) && reason == BFQ_BFQQ_BUDGET_TIMEOUT &&
++ !bfq_bfqq_constantly_seeky(bfqq)) {
++ bfq_mark_bfqq_constantly_seeky(bfqq);
++ if (!blk_queue_nonrot(bfqd->queue))
++ bfqd->const_seeky_busy_in_flight_queues++;
++ }
++
++ if (reason == BFQ_BFQQ_TOO_IDLE &&
++	    bfqq->entity.service <= 2 * bfqq->entity.budget / 10)
++ bfq_clear_bfqq_IO_bound(bfqq);
++
++ if (bfqd->low_latency && bfqq->wr_coeff == 1)
++ bfqq->last_wr_start_finish = jiffies;
+
-+ if (bfqd->low_latency && bfqd->bfq_raising_max_softrt_rate > 0 &&
++ if (bfqd->low_latency && bfqd->bfq_wr_max_softrt_rate > 0 &&
+ RB_EMPTY_ROOT(&bfqq->sort_list)) {
+ /*
+ * If we get here, and there are no outstanding requests,
+ * then the request pattern is isochronous (see the comments
-+ * to the function bfq_bfqq_softrt_next_start()). Hence we can
-+ * compute soft_rt_next_start. If, instead, the queue still
-+ * has outstanding requests, then we have to wait for the
-+ * completion of all the outstanding requests to discover
-+ * whether the request pattern is actually isochronous.
++ * to the function bfq_bfqq_softrt_next_start()). Hence we
++ * can compute soft_rt_next_start. If, instead, the queue
++ * still has outstanding requests, then we have to wait
++ * for the completion of all the outstanding requests to
++ * discover whether the request pattern is actually
++ * isochronous.
+ */
+ if (bfqq->dispatched == 0)
+ bfqq->soft_rt_next_start =
@@ -2651,10 +3086,13 @@ index 0000000..f5f71e4
+ }
+
+ bfq_log_bfqq(bfqd, bfqq,
-+ "expire (%d, slow %d, num_disp %d, idle_win %d)", reason, slow,
-+ bfqq->dispatched, bfq_bfqq_idle_window(bfqq));
++ "expire (%d, slow %d, num_disp %d, idle_win %d)", reason,
++ slow, bfqq->dispatched, bfq_bfqq_idle_window(bfqq));
+
-+ /* Increase, decrease or leave budget unchanged according to reason */
++ /*
++ * Increase, decrease or leave budget unchanged according to
++ * reason.
++ */
+ __bfq_bfqq_recalc_budget(bfqd, bfqq, reason);
+ __bfq_bfqq_expire(bfqd, bfqq);
+}
@@ -2666,12 +3104,9 @@ index 0000000..f5f71e4
+ */
+static int bfq_bfqq_budget_timeout(struct bfq_queue *bfqq)
+{
-+ if (bfq_bfqq_budget_new(bfqq))
-+ return 0;
-+
-+ if (time_before(jiffies, bfqq->budget_timeout))
++ if (bfq_bfqq_budget_new(bfqq) ||
++ time_before(jiffies, bfqq->budget_timeout))
+ return 0;
-+
+ return 1;
+}
+
@@ -2686,7 +3121,7 @@ index 0000000..f5f71e4
+static inline int bfq_may_expire_for_budg_timeout(struct bfq_queue *bfqq)
+{
+ bfq_log_bfqq(bfqq->bfqd, bfqq,
-+ "may_budget_timeout: wr %d left %d timeout %d",
++ "may_budget_timeout: wait_request %d left %d timeout %d",
+ bfq_bfqq_wait_request(bfqq),
+ bfq_bfqq_budget_left(bfqq) >= bfqq->entity.budget / 3,
+ bfq_bfqq_budget_timeout(bfqq));
@@ -2698,79 +3133,190 @@ index 0000000..f5f71e4
+}
+
+/*
-+ * For weight-raised queues issuing sync requests, idling is always performed,
-+ * as this is instrumental in guaranteeing a high fraction of the throughput
-+ * to these queues, and hence in guaranteeing a lower latency for their
-+ * requests. See [1] for details.
++ * Device idling is allowed only for the queues for which this function
++ * returns true. For this reason, the return value of this function plays a
++ * critical role for both throughput boosting and service guarantees. The
++ * return value is computed through a logical expression. In this rather
++ * long comment, we try to briefly describe all the details and motivations
++ * behind the components of this logical expression.
++ *
++ * First, the expression is false if bfqq is not sync, or if: bfqq happened
++ * to become active during a large burst of queue activations, and the
++ * pattern of requests that bfqq contains boosts the throughput if bfqq is
++ * expired. In fact, queues that became active during a large burst benefit
++ * only from throughput, as discussed in the comments to bfq_handle_burst.
++ * In this respect, expiring bfqq certainly boosts the throughput on NCQ-
++ * capable flash-based devices, whereas, on rotational devices, it boosts
++ * the throughput only if bfqq contains random requests.
++ *
++ * On the opposite end, if (a) bfqq is sync, (b) the above burst-related
++ * condition does not hold, and (c) bfqq is being weight-raised, then the
++ * expression always evaluates to true, as device idling is instrumental
++ * for preserving low-latency guarantees (see [1]). If, instead, conditions
++ * (a) and (b) do hold, but (c) does not, then the expression evaluates to
++ * true only if: (1) bfqq is I/O-bound and has a non-null idle window, and
++ * (2) at least one of the following two conditions holds.
++ * The first condition is that the device is not performing NCQ, because
++ * idling the device most certainly boosts the throughput if this condition
++ * holds and bfqq is I/O-bound and has been granted a non-null idle window.
++ * The second compound condition is made of the logical AND of two components.
++ *
++ * The first component is true only if there is no weight-raised busy
++ * queue. This guarantees that the device is not idled for a sync non-
++ * weight-raised queue when there are busy weight-raised queues. Such a queue
++ * is then expired immediately if empty. Combined with the timestamping
++ * rules of BFQ (see [1] for details), this causes sync non-weight-raised
++ * queues to get a lower number of requests served, and hence to ask for a
++ * lower number of requests from the request pool, before the busy weight-
++ * raised queues get served again.
++ *
++ * This is beneficial for the processes associated with weight-raised
++ * queues, when the request pool is saturated (e.g., in the presence of
++ * write hogs). In fact, if the processes associated with the other queues
++ * ask for requests at a lower rate, then weight-raised processes have a
++ * higher probability to get a request from the pool immediately (or at
++ * least soon) when they need one. Hence they have a higher probability to
++ * actually get a fraction of the disk throughput proportional to their
++ * high weight. This is especially true with NCQ-capable drives, which
++ * enqueue several requests in advance and further reorder internally-
++ * queued requests.
++ *
++ * In the end, mistreating non-weight-raised queues when there are busy
++ * weight-raised queues seems to mitigate starvation problems in the
++ * presence of heavy write workloads and NCQ, and hence to guarantee a
++ * higher application and system responsiveness in these hostile scenarios.
++ *
++ * If the first component of the compound condition is instead true, i.e.,
++ * there is no weight-raised busy queue, then the second component of the
++ * compound condition takes into account service-guarantee and throughput
++ * issues related to NCQ (recall that the compound condition is evaluated
++ * only if the device is detected as supporting NCQ).
++ *
++ * As for service guarantees, allowing the drive to enqueue more than one
++ * request at a time, and hence delegating de facto final scheduling
++ * decisions to the drive's internal scheduler, causes loss of control over
++ * the actual request service order. In this respect, when the drive is
++ * allowed to enqueue more than one request at a time, the service
++ * distribution enforced by the drive's internal scheduler is likely to
++ * coincide with the desired device-throughput distribution only in the
++ * following, perfectly symmetric, scenario:
++ * 1) all active queues have the same weight,
++ * 2) all active groups at the same level in the groups tree have the same
++ * weight,
++ * 3) all active groups at the same level in the groups tree have the same
++ * number of children.
++ *
++ * Even in such a scenario, sequential I/O may still receive a preferential
++ * treatment, but this is not likely to be a big issue with flash-based
++ * devices, because of their non-dramatic loss of throughput with random
++ * I/O. Things do differ with HDDs, for which additional care is taken, as
++ * explained after completing the discussion for flash-based devices.
+ *
-+ * For non-weight-raised queues, idling is instead disabled if the device is
-+ * NCQ-enabled and non-rotational, as this boosts the throughput on such
-+ * devices.
++ * Unfortunately, keeping the necessary state for evaluating exactly the
++ * above symmetry conditions would be quite complex and time-consuming.
++ * Therefore BFQ evaluates instead the following stronger sub-conditions,
++ * for which it is much easier to maintain the needed state:
++ * 1) all active queues have the same weight,
++ * 2) all active groups have the same weight,
++ * 3) all active groups have at most one active child each.
++ * In particular, the last two conditions are always true if hierarchical
++ * support and the cgroups interface are not enabled, hence no state needs
++ * to be maintained in this case.
++ *
++ * According to the above considerations, the second component of the
++ * compound condition evaluates to true if any of the above symmetry
++ * sub-conditions does not hold, or the device is not flash-based. Therefore,
++ * if also the first component is true, then idling is allowed for a sync
++ * queue. These are the only sub-conditions considered if the device is
++ * flash-based, as, for such a device, it is sensible to force idling only
++ * for service-guarantee issues. In fact, as for throughput, idling
++ * NCQ-capable flash-based devices would not boost the throughput even
++ * with sequential I/O; rather it would lower the throughput in proportion
++ * to how fast the device is. In the end, (only) if all three
++ * sub-conditions hold and the device is flash-based, the compound
++ * condition evaluates to false and therefore no idling is performed.
++ *
++ * As already said, things change with a rotational device, where idling
++ * boosts the throughput with sequential I/O (even with NCQ). Hence, for
++ * such a device the second component of the compound condition evaluates
++ * to true also if the following additional sub-condition does not hold:
++ * the queue is constantly seeky. Unfortunately, this different behavior
++ * with respect to flash-based devices causes an additional asymmetry: if
++ * some sync queues enjoy idling and some other sync queues do not, then
++ * the latter get a low share of the device throughput, simply because the
++ * former get many requests served after being set as in service, whereas
++ * the latter do not. As a consequence, to guarantee the desired throughput
++ * distribution, on HDDs the compound expression evaluates to true (and
++ * hence device idling is performed) also if the following last symmetry
++ * condition does not hold: no other queue is benefiting from idling. This
++ * last condition, too, is actually replaced with a simpler-to-maintain and
++ * stronger condition: there is no busy queue which is not constantly seeky
++ * (and hence may also benefit from idling).
++ *
++ * To sum up, when all the required symmetry and throughput-boosting
++ * sub-conditions hold, the second component of the compound condition
++ * evaluates to false, and hence no idling is performed. This helps to
++ * keep the drives' internal queues full on NCQ-capable devices, and hence
++ * to boost the throughput, without causing 'almost' any loss of service
++ * guarantees. The 'almost' follows from the fact that, if the internal
++ * queue of one such device is filled while all the sub-conditions hold,
++ * but at some point in time some sub-condition ceases to hold, then it may
++ * become impossible to let requests be served in the new desired order
++ * until all the requests already queued in the device have been served.
+ */
+static inline bool bfq_bfqq_must_not_expire(struct bfq_queue *bfqq)
+{
+ struct bfq_data *bfqd = bfqq->bfqd;
++#ifdef CONFIG_CGROUP_BFQIO
++#define symmetric_scenario (!bfqd->active_numerous_groups && \
++ !bfq_differentiated_weights(bfqd))
++#else
++#define symmetric_scenario (!bfq_differentiated_weights(bfqd))
++#endif
++#define cond_for_seeky_on_ncq_hdd (bfq_bfqq_constantly_seeky(bfqq) && \
++ bfqd->busy_in_flight_queues == \
++ bfqd->const_seeky_busy_in_flight_queues)
++
++#define cond_for_expiring_in_burst (bfq_bfqq_in_large_burst(bfqq) && \
++ bfqd->hw_tag && \
++ (blk_queue_nonrot(bfqd->queue) || \
++ bfq_bfqq_constantly_seeky(bfqq)))
+
-+ return bfq_bfqq_sync(bfqq) && (
-+ bfqq->raising_coeff > 1 ||
-+ (bfq_bfqq_idle_window(bfqq) &&
-+ !(bfqd->hw_tag &&
-+ (blk_queue_nonrot(bfqd->queue) ||
-+ /*
-+ * If there are weight-raised busy queues, then do not idle
-+ * the disk for a sync non-weight-raised queue, and hence
-+ * expire the queue immediately if empty. Combined with the
-+ * timestamping rules of BFQ (see [1] for details), this
-+ * causes sync non-weight-raised queues to get a lower
-+ * fraction of the disk throughput, and hence reduces the rate
-+ * at which the processes associated to these queues ask for
-+ * requests from the request pool.
-+ *
-+ * This is beneficial for weight-raised processes, when the
-+ * system operates in request-pool saturation conditions
-+ * (e.g., in the presence of write hogs). In fact, if
-+ * non-weight-raised processes ask for requests at a lower
-+ * rate, then weight-raised processes have a higher
-+ * probability to get a request from the pool immediately
-+ * (or at least soon) when they need one. Hence they have a
-+ * higher probability to actually get a fraction of the disk
-+ * throughput proportional to their high weight. This is
-+ * especially true with NCQ-enabled drives, which enqueue
-+ * several requests in advance and further reorder
-+ * internally-queued requests.
-+ *
-+ * Mistreating non-weight-raised queues in the above-described
-+ * way, when there are busy weight-raised queues, seems to
-+ * mitigate starvation problems in the presence of heavy write
-+ * workloads and NCQ, and hence to guarantee a higher
-+ * application and system responsiveness in these hostile
-+ * scenarios.
-+ */
-+ bfqd->raised_busy_queues > 0)
-+ )
-+ )
++/*
++ * Condition for expiring a non-weight-raised queue (and hence not idling
++ * the device).
++ */
++#define cond_for_expiring_non_wr (bfqd->hw_tag && \
++ (bfqd->wr_busy_queues > 0 || \
++ (symmetric_scenario && \
++ (blk_queue_nonrot(bfqd->queue) || \
++ cond_for_seeky_on_ncq_hdd))))
++
++ return bfq_bfqq_sync(bfqq) &&
++ !cond_for_expiring_in_burst &&
++ (bfqq->wr_coeff > 1 ||
++ (bfq_bfqq_IO_bound(bfqq) && bfq_bfqq_idle_window(bfqq) &&
++ !cond_for_expiring_non_wr)
+ );
+}
+
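Reading the macros and the return statement above together, the idling decision can be paraphrased as the following boolean sketch (names shortened, illustrative only):

	/*
	 * idle(bfqq) =
	 *	sync &&
	 *	!(in_large_burst && hw_tag && (nonrot || constantly_seeky)) &&
	 *	(wr_coeff > 1 ||
	 *	 (IO_bound && idle_window &&
	 *	  !(hw_tag &&
	 *	    (wr_busy_queues > 0 ||
	 *	     (symmetric_scenario &&
	 *	      (nonrot || cond_for_seeky_on_ncq_hdd))))));
	 */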
+/*
-+ * If the in-service queue is empty, but it is sync and either of the following
-+ * conditions holds, then: 1) the queue must remain in service and cannot be
-+ * expired, and 2) the disk must be idled to wait for the possible arrival
-+ * of a new request for the queue. The conditions are:
-+ * - the device is rotational and not performing NCQ, and the queue has its
-+ * idle window set (in this case, waiting for a new request for the queue
-+ * is likely to boost the disk throughput);
-+ * - the queue is weight-raised (waiting for the request is necessary to
-+ * provide the queue with fairness and latency guarantees, see [1] for
-+ * details).
++ * If the in-service queue is empty but sync, and the function
++ * bfq_bfqq_must_not_expire returns true, then:
++ * 1) the queue must remain in service and cannot be expired, and
++ * 2) the disk must be idled to wait for the possible arrival of a new
++ * request for the queue.
++ * See the comments to the function bfq_bfqq_must_not_expire for the reasons
++ * why performing device idling is the best choice to boost the throughput
++ * and preserve service guarantees when bfq_bfqq_must_not_expire itself
++ * returns true.
+ */
+static inline bool bfq_bfqq_must_idle(struct bfq_queue *bfqq)
+{
+ struct bfq_data *bfqd = bfqq->bfqd;
+
+ return RB_EMPTY_ROOT(&bfqq->sort_list) && bfqd->bfq_slice_idle != 0 &&
-+ bfq_bfqq_must_not_expire(bfqq) &&
-+ !bfq_queue_nonrot_noidle(bfqd, bfqq);
++ bfq_bfqq_must_not_expire(bfqq);
+}
+
+/*
@@ -2817,16 +3363,18 @@ index 0000000..f5f71e4
+ goto expire;
+ } else {
+ /*
-+ * The idle timer may be pending because we may not
-+ * disable disk idling even when a new request arrives
++ * The idle timer may be pending because we may
++ * not disable disk idling even when a new request
++ * arrives.
+ */
+ if (timer_pending(&bfqd->idle_slice_timer)) {
+ /*
+ * If we get here: 1) at least a new request
+ * has arrived but we have not disabled the
+ * timer because the request was too small,
-+ * 2) then the block layer has unplugged the
-+ * device, causing the dispatch to be invoked.
++ * 2) then the block layer has unplugged
++ * the device, causing the dispatch to be
++ * invoked.
+ *
+ * Since the device is unplugged, now the
+ * requests are probably large enough to
@@ -2844,9 +3392,9 @@ index 0000000..f5f71e4
+ }
+
+ /*
-+ * No requests pending. If the in-service queue has no cooperator and
-+ * still has requests in flight (possibly waiting for a completion)
-+ * or is idling for a new request, then keep it.
++ * No requests pending. If the in-service queue still has requests
++ * in flight (possibly waiting for a completion) or is idling for a
++ * new request, then keep it.
+ */
+ if (new_bfqq == NULL && (timer_pending(&bfqd->idle_slice_timer) ||
+ (bfqq->dispatched != 0 && bfq_bfqq_must_not_expire(bfqq)))) {
@@ -2872,40 +3420,38 @@ index 0000000..f5f71e4
+ return bfqq;
+}
+
-+static void bfq_update_raising_data(struct bfq_data *bfqd,
-+ struct bfq_queue *bfqq)
++static void bfq_update_wr_data(struct bfq_data *bfqd,
++ struct bfq_queue *bfqq)
+{
-+ if (bfqq->raising_coeff > 1) { /* queue is being boosted */
++ if (bfqq->wr_coeff > 1) { /* queue is being boosted */
+ struct bfq_entity *entity = &bfqq->entity;
+
+ bfq_log_bfqq(bfqd, bfqq,
-+ "raising period dur %u/%u msec, "
-+ "old raising coeff %u, w %d(%d)",
++ "raising period dur %u/%u msec, old coeff %u, w %d(%d)",
+ jiffies_to_msecs(jiffies -
-+ bfqq->last_rais_start_finish),
-+ jiffies_to_msecs(bfqq->raising_cur_max_time),
-+ bfqq->raising_coeff,
++ bfqq->last_wr_start_finish),
++ jiffies_to_msecs(bfqq->wr_cur_max_time),
++ bfqq->wr_coeff,
+ bfqq->entity.weight, bfqq->entity.orig_weight);
+
+ BUG_ON(bfqq != bfqd->in_service_queue && entity->weight !=
-+ entity->orig_weight * bfqq->raising_coeff);
++ entity->orig_weight * bfqq->wr_coeff);
+ if (entity->ioprio_changed)
-+ bfq_log_bfqq(bfqd, bfqq,
-+ "WARN: pending prio change");
++ bfq_log_bfqq(bfqd, bfqq, "WARN: pending prio change");
+ /*
-+ * If too much time has elapsed from the beginning
-+ * of this weight-raising, stop it.
++ * If the queue was activated in a burst, or
++ * too much time has elapsed from the beginning
++ * of this weight-raising, then end weight raising.
+ */
-+ if (time_is_before_jiffies(bfqq->last_rais_start_finish +
-+ bfqq->raising_cur_max_time)) {
-+ bfqq->last_rais_start_finish = jiffies;
++ if (bfq_bfqq_in_large_burst(bfqq) ||
++ time_is_before_jiffies(bfqq->last_wr_start_finish +
++ bfqq->wr_cur_max_time)) {
++ bfqq->last_wr_start_finish = jiffies;
+ bfq_log_bfqq(bfqd, bfqq,
-+ "wrais ending at %lu, "
-+ "rais_max_time %u",
-+ bfqq->last_rais_start_finish,
-+ jiffies_to_msecs(bfqq->
-+ raising_cur_max_time));
-+ bfq_bfqq_end_raising(bfqq);
++ "wrais ending at %lu, rais_max_time %u",
++ bfqq->last_wr_start_finish,
++ jiffies_to_msecs(bfqq->wr_cur_max_time));
++ bfq_bfqq_end_wr(bfqq);
+ __bfq_entity_update_weight_prio(
+ bfq_entity_service_tree(entity),
+ entity);
@@ -2934,20 +3480,18 @@ index 0000000..f5f71e4
+
+ if (service_to_charge > bfq_bfqq_budget_left(bfqq)) {
+ /*
-+ * This may happen if the next rq is chosen
-+ * in fifo order instead of sector order.
-+ * The budget is properly dimensioned
-+ * to be always sufficient to serve the next request
-+ * only if it is chosen in sector order. The reason is
-+ * that it would be quite inefficient and little useful
-+ * to always make sure that the budget is large enough
-+ * to serve even the possible next rq in fifo order.
++ * This may happen if the next rq is chosen in fifo order
++ * instead of sector order. The budget is properly
++ * dimensioned to be always sufficient to serve the next
++ * request only if it is chosen in sector order. The reason
++ * is that it would be quite inefficient and little useful
++ * to always make sure that the budget is large enough to
++ * serve even the possible next rq in fifo order.
+ * In fact, requests are seldom served in fifo order.
+ *
-+ * Expire the queue for budget exhaustion, and
-+ * make sure that the next act_budget is enough
-+ * to serve the next request, even if it comes
-+ * from the fifo expired path.
++ * Expire the queue for budget exhaustion, and make sure
++ * that the next act_budget is enough to serve the next
++ * request, even if it comes from the fifo expired path.
+ */
+ bfqq->next_rq = rq;
+ /*
@@ -2963,7 +3507,7 @@ index 0000000..f5f71e4
+ bfq_bfqq_served(bfqq, service_to_charge);
+ bfq_dispatch_insert(bfqd->queue, rq);
+
-+ bfq_update_raising_data(bfqd, bfqq);
++ bfq_update_wr_data(bfqd, bfqq);
+
+ bfq_log_bfqq(bfqd, bfqq,
+ "dispatched %u sec req (%llu), budg left %lu",
@@ -3004,8 +3548,8 @@ index 0000000..f5f71e4
+}
+
+/*
-+ * Drain our current requests. Used for barriers and when switching
-+ * io schedulers on-the-fly.
++ * Drain our current requests.
++ * Used for barriers and when switching io schedulers on-the-fly.
+ */
+static int bfq_forced_dispatch(struct bfq_data *bfqd)
+{
@@ -3105,6 +3649,17 @@ index 0000000..f5f71e4
+ BUG_ON(bfq_bfqq_busy(bfqq));
+ BUG_ON(bfqd->in_service_queue == bfqq);
+
++ if (bfq_bfqq_sync(bfqq))
++ /*
++ * The fact that this queue is being destroyed does not
++ * invalidate the fact that this queue may have been
++ * activated during the current burst. As a consequence,
++ * although the queue does not exist anymore, and hence
++ * needs to be removed from the burst list if present,
++ * the burst size must not be decremented.
++ */
++ hlist_del_init(&bfqq->burst_list_node);
++
+ bfq_log_bfqq(bfqd, bfqq, "put_queue: %p freed", bfqq);
+
+ kmem_cache_free(bfq_pool, bfqq);
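A side note on why the hlist_del_init() call above needs no membership check: the stock helper from <linux/list.h> is a no-op on an unhashed node, roughly:

	static inline void hlist_del_init(struct hlist_node *n)
	{
		if (!hlist_unhashed(n)) {	/* actually on a list? */
			__hlist_del(n);
			INIT_HLIST_NODE(n);	/* unhashed again */
		}
	}

so the INIT_HLIST_NODE() done in bfq_init_bfqq() is enough to make the removal safe even for sync queues that never joined a burst list.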
@@ -3121,10 +3676,8 @@ index 0000000..f5f71e4
+ */
+ __bfqq = bfqq->new_bfqq;
+ while (__bfqq) {
-+ if (__bfqq == bfqq) {
-+ WARN(1, "bfqq->new_bfqq loop detected.\n");
++ if (__bfqq == bfqq)
+ break;
-+ }
+ next = __bfqq->new_bfqq;
+ bfq_put_queue(__bfqq);
+ __bfqq = next;
@@ -3146,7 +3699,7 @@ index 0000000..f5f71e4
+ bfq_put_queue(bfqq);
+}
+
-+static void bfq_init_icq(struct io_cq *icq)
++static inline void bfq_init_icq(struct io_cq *icq)
+{
+ struct bfq_io_cq *bic = icq_to_bic(icq);
+
@@ -3185,7 +3738,7 @@ index 0000000..f5f71e4
+ switch (ioprio_class) {
+ default:
+ dev_err(bfqq->bfqd->queue->backing_dev_info.dev,
-+ "bfq: bad prio %x\n", ioprio_class);
++ "bfq: bad prio class %d\n", ioprio_class);
+ case IOPRIO_CLASS_NONE:
+ /*
+ * No prio set, inherit CPU scheduling settings.
@@ -3208,13 +3761,15 @@ index 0000000..f5f71e4
+ break;
+ }
+
++ if (bfqq->entity.new_ioprio < 0 ||
++ bfqq->entity.new_ioprio >= IOPRIO_BE_NR) {
++ printk(KERN_CRIT "bfq_init_prio_data: new_ioprio %d\n",
++ bfqq->entity.new_ioprio);
++ BUG();
++ }
++
+ bfqq->entity.ioprio_changed = 1;
+
-+ /*
-+ * Keep track of original prio settings in case we have to temporarily
-+ * elevate the priority of this queue.
-+ */
-+ bfqq->org_ioprio = bfqq->entity.new_ioprio;
+ bfq_clear_bfqq_prio_changed(bfqq);
+}
+
@@ -3229,8 +3784,8 @@ index 0000000..f5f71e4
+ bfqd = bfq_get_bfqd_locked(&(bic->icq.q->elevator->elevator_data),
+ &flags);
+ /*
-+ * This condition may trigger on a newly created bic, be sure to drop
-+ * the lock before returning.
++ * This condition may trigger on a newly created bic, be sure to
++ * drop the lock before returning.
+ */
+ if (unlikely(bfqd == NULL) || likely(bic->ioprio == ioprio))
+ goto out;
@@ -3265,6 +3820,7 @@ index 0000000..f5f71e4
+{
+ RB_CLEAR_NODE(&bfqq->entity.rb_node);
+ INIT_LIST_HEAD(&bfqq->fifo);
++ INIT_HLIST_NODE(&bfqq->burst_list_node);
+
+ atomic_set(&bfqq->ref, 0);
+ bfqq->bfqd = bfqd;
@@ -3276,13 +3832,14 @@ index 0000000..f5f71e4
+ bfq_mark_bfqq_idle_window(bfqq);
+ bfq_mark_bfqq_sync(bfqq);
+ }
++ bfq_mark_bfqq_IO_bound(bfqq);
+
+ /* Tentative initial value to trade off between thr and lat */
+ bfqq->max_budget = (2 * bfq_max_budget(bfqd)) / 3;
+ bfqq->pid = pid;
+
-+ bfqq->raising_coeff = 1;
-+ bfqq->last_rais_start_finish = 0;
++ bfqq->wr_coeff = 1;
++ bfqq->last_wr_start_finish = 0;
+ /*
+ * Set to the value for which bfqq will not be deemed as
+ * soft rt when it becomes backlogged.
@@ -3327,14 +3884,13 @@ index 0000000..f5f71e4
+
+ if (bfqq != NULL) {
+ bfq_init_bfqq(bfqd, bfqq, current->pid, is_sync);
++ bfq_init_prio_data(bfqq, bic);
++ bfq_init_entity(&bfqq->entity, bfqg);
+ bfq_log_bfqq(bfqd, bfqq, "allocated");
+ } else {
+ bfqq = &bfqd->oom_bfqq;
+ bfq_log_bfqq(bfqd, bfqq, "using oom bfqq");
+ }
-+
-+ bfq_init_prio_data(bfqq, bic);
-+ bfq_init_entity(&bfqq->entity, bfqg);
+ }
+
+ if (new_bfqq != NULL)
@@ -3381,7 +3937,8 @@ index 0000000..f5f71e4
+ bfqq = bfq_find_alloc_queue(bfqd, bfqg, is_sync, bic, gfp_mask);
+
+ /*
-+ * Pin the queue now that it's allocated, scheduler exit will prune it.
++ * Pin the queue now that it's allocated, scheduler exit will
++ * prune it.
+ */
+ if (!is_sync && *async_bfqq == NULL) {
+ atomic_inc(&bfqq->ref);
@@ -3460,11 +4017,11 @@ index 0000000..f5f71e4
+ if (atomic_read(&bic->icq.ioc->active_ref) == 0 ||
+ bfqd->bfq_slice_idle == 0 ||
+ (bfqd->hw_tag && BFQQ_SEEKY(bfqq) &&
-+ bfqq->raising_coeff == 1))
++ bfqq->wr_coeff == 1))
+ enable_idle = 0;
+ else if (bfq_sample_valid(bic->ttime.ttime_samples)) {
+ if (bic->ttime.ttime_mean > bfqd->bfq_slice_idle &&
-+ bfqq->raising_coeff == 1)
++ bfqq->wr_coeff == 1)
+ enable_idle = 0;
+ else
+ enable_idle = 1;
@@ -3492,6 +4049,13 @@ index 0000000..f5f71e4
+
+ bfq_update_io_thinktime(bfqd, bic);
+ bfq_update_io_seektime(bfqd, bfqq, rq);
++ if (!BFQQ_SEEKY(bfqq) && bfq_bfqq_constantly_seeky(bfqq)) {
++ bfq_clear_bfqq_constantly_seeky(bfqq);
++ if (!blk_queue_nonrot(bfqd->queue)) {
++ BUG_ON(!bfqd->const_seeky_busy_in_flight_queues);
++ bfqd->const_seeky_busy_in_flight_queues--;
++ }
++ }
+ if (bfqq->entity.service > bfq_max_budget(bfqd) / 8 ||
+ !BFQQ_SEEKY(bfqq))
+ bfq_update_idle_window(bfqd, bfqq, bic);
@@ -3560,7 +4124,7 @@ index 0000000..f5f71e4
+ assert_spin_locked(bfqd->queue->queue_lock);
+ bfq_init_prio_data(bfqq, RQ_BIC(rq));
+
-+ bfq_add_rq_rb(rq);
++ bfq_add_request(rq);
+
+ rq_set_fifo_time(rq, jiffies + bfqd->bfq_fifo_expire[rq_is_sync(rq)]);
+ list_add_tail(&rq->queuelist, &bfqq->fifo);
@@ -3597,23 +4161,36 @@ index 0000000..f5f71e4
+{
+ struct bfq_queue *bfqq = RQ_BFQQ(rq);
+ struct bfq_data *bfqd = bfqq->bfqd;
-+ const int sync = rq_is_sync(rq);
++ bool sync = bfq_bfqq_sync(bfqq);
+
-+ bfq_log_bfqq(bfqd, bfqq, "completed %u sects req (%d)",
-+ blk_rq_sectors(rq), sync);
++ bfq_log_bfqq(bfqd, bfqq, "completed one req with %u sects left (%d)",
++ blk_rq_sectors(rq), sync);
+
+ bfq_update_hw_tag(bfqd);
+
-+ WARN_ON(!bfqd->rq_in_driver);
-+ WARN_ON(!bfqq->dispatched);
++ BUG_ON(!bfqd->rq_in_driver);
++ BUG_ON(!bfqq->dispatched);
+ bfqd->rq_in_driver--;
+ bfqq->dispatched--;
+
-+ if (bfq_bfqq_sync(bfqq))
-+ bfqd->sync_flight--;
++ if (!bfqq->dispatched && !bfq_bfqq_busy(bfqq)) {
++ bfq_weights_tree_remove(bfqd, &bfqq->entity,
++ &bfqd->queue_weights_tree);
++ if (!blk_queue_nonrot(bfqd->queue)) {
++ BUG_ON(!bfqd->busy_in_flight_queues);
++ bfqd->busy_in_flight_queues--;
++ if (bfq_bfqq_constantly_seeky(bfqq)) {
++ BUG_ON(!bfqd->
++ const_seeky_busy_in_flight_queues);
++ bfqd->const_seeky_busy_in_flight_queues--;
++ }
++ }
++ }
+
-+ if (sync)
++ if (sync) {
++ bfqd->sync_flight--;
+ RQ_BIC(rq)->ttime.last_end_request = jiffies;
++ }
+
+ /*
+ * If we are waiting to discover whether the request pattern of the
@@ -3673,9 +4250,9 @@ index 0000000..f5f71e4
+
+ /*
+ * Don't force setup of a queue from here, as a call to may_queue
-+ * does not necessarily imply that a request actually will be queued.
-+ * So just lookup a possibly existing queue, or return 'may queue'
-+ * if that fails.
++ * does not necessarily imply that a request actually will be
++ * queued. So just lookup a possibly existing queue, or return
++ * 'may queue' if that fails.
+ */
+ bic = bfq_bic_lookup(bfqd, tsk->io_context);
+ if (bic == NULL)
@@ -3844,12 +4421,12 @@ index 0000000..f5f71e4
+
+ bfqq = bfqd->in_service_queue;
+ /*
-+ * Theoretical race here: the in-service queue can be NULL or different
-+ * from the queue that was idling if the timer handler spins on
-+ * the queue_lock and a new request arrives for the current
-+ * queue and there is a full dispatch cycle that changes the
-+ * in-service queue. This can hardly happen, but in the worst case
-+ * we just expire a queue too early.
++ * Theoretical race here: the in-service queue can be NULL or
++ * different from the queue that was idling if the timer handler
++ * spins on the queue_lock and a new request arrives for the
++ * current queue and there is a full dispatch cycle that changes
++ * the in-service queue. This can hardly happen, but in the worst
++ * case we just expire a queue too early.
+ */
+ if (bfqq != NULL) {
+ bfq_log_bfqq(bfqd, bfqq, "slice_timer expired");
@@ -3863,9 +4440,9 @@ index 0000000..f5f71e4
+ else if (bfqq->queued[0] == 0 && bfqq->queued[1] == 0)
+ /*
+ * The queue may not be empty upon timer expiration,
-+ * because we may not disable the timer when the first
-+ * request of the in-service queue arrives during
-+ * disk idling
++ * because we may not disable the timer when the
++ * first request of the in-service queue arrives
++ * during disk idling.
+ */
+ reason = BFQ_BFQQ_TOO_IDLE;
+ else
@@ -3970,6 +4547,14 @@ index 0000000..f5f71e4
+ */
+ bfq_init_bfqq(bfqd, &bfqd->oom_bfqq, 1, 0);
+ atomic_inc(&bfqd->oom_bfqq.ref);
++ bfqd->oom_bfqq.entity.new_ioprio = BFQ_DEFAULT_QUEUE_IOPRIO;
++ bfqd->oom_bfqq.entity.new_ioprio_class = IOPRIO_CLASS_BE;
++ /*
++ * Trigger weight initialization, according to ioprio, at the
++ * oom_bfqq's first activation. The oom_bfqq's ioprio and ioprio
++ * class won't be changed any more.
++ */
++ bfqd->oom_bfqq.entity.ioprio_changed = 1;
+
+ bfqd->queue = q;
+
@@ -3985,17 +4570,24 @@ index 0000000..f5f71e4
+ }
+
+ bfqd->root_group = bfqg;
++ bfq_init_entity(&bfqd->oom_bfqq.entity, bfqd->root_group);
++#ifdef CONFIG_CGROUP_BFQIO
++ bfqd->active_numerous_groups = 0;
++#endif
+
+ init_timer(&bfqd->idle_slice_timer);
+ bfqd->idle_slice_timer.function = bfq_idle_slice_timer;
+ bfqd->idle_slice_timer.data = (unsigned long)bfqd;
+
+ bfqd->rq_pos_tree = RB_ROOT;
++ bfqd->queue_weights_tree = RB_ROOT;
++ bfqd->group_weights_tree = RB_ROOT;
+
+ INIT_WORK(&bfqd->unplug_work, bfq_kick_queue);
+
+ INIT_LIST_HEAD(&bfqd->active_list);
+ INIT_LIST_HEAD(&bfqd->idle_list);
++ INIT_HLIST_HEAD(&bfqd->burst_list);
+
+ bfqd->hw_tag = -1;
+
@@ -4012,29 +4604,38 @@ index 0000000..f5f71e4
+ bfqd->bfq_timeout[BLK_RW_ASYNC] = bfq_timeout_async;
+ bfqd->bfq_timeout[BLK_RW_SYNC] = bfq_timeout_sync;
+
++ bfqd->bfq_coop_thresh = 2;
++ bfqd->bfq_failed_cooperations = 7000;
++ bfqd->bfq_requests_within_timer = 120;
++
++ bfqd->bfq_large_burst_thresh = 11;
++ bfqd->bfq_burst_interval = msecs_to_jiffies(500);
++
+ bfqd->low_latency = true;
+
-+ bfqd->bfq_raising_coeff = 20;
-+ bfqd->bfq_raising_rt_max_time = msecs_to_jiffies(300);
-+ bfqd->bfq_raising_max_time = 0;
-+ bfqd->bfq_raising_min_idle_time = msecs_to_jiffies(2000);
-+ bfqd->bfq_raising_min_inter_arr_async = msecs_to_jiffies(500);
-+ bfqd->bfq_raising_max_softrt_rate = 7000; /*
-+ * Approximate rate required
-+ * to playback or record a
-+ * high-definition compressed
-+ * video.
-+ */
-+ bfqd->raised_busy_queues = 0;
-+
-+ /* Initially estimate the device's peak rate as the reference rate */
-+ if (blk_queue_nonrot(bfqd->queue)) {
-+ bfqd->RT_prod = R_nonrot * T_nonrot;
-+ bfqd->peak_rate = R_nonrot;
-+ } else {
-+ bfqd->RT_prod = R_rot * T_rot;
-+ bfqd->peak_rate = R_rot;
-+ }
++ bfqd->bfq_wr_coeff = 20;
++ bfqd->bfq_wr_rt_max_time = msecs_to_jiffies(300);
++ bfqd->bfq_wr_max_time = 0;
++ bfqd->bfq_wr_min_idle_time = msecs_to_jiffies(2000);
++ bfqd->bfq_wr_min_inter_arr_async = msecs_to_jiffies(500);
++ bfqd->bfq_wr_max_softrt_rate = 7000; /*
++ * Approximate rate required
++ * to playback or record a
++ * high-definition compressed
++ * video.
++ */
++ bfqd->wr_busy_queues = 0;
++ bfqd->busy_in_flight_queues = 0;
++ bfqd->const_seeky_busy_in_flight_queues = 0;
++
++ /*
++ * Begin by assuming, optimistically, that the device peak rate is
++ * equal to the highest reference rate.
++ */
++ bfqd->RT_prod = R_fast[blk_queue_nonrot(bfqd->queue)] *
++ T_fast[blk_queue_nonrot(bfqd->queue)];
++ bfqd->peak_rate = R_fast[blk_queue_nonrot(bfqd->queue)];
++ bfqd->device_speed = BFQ_BFQD_FAST;
+
+ return 0;
+}
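The peak-rate initialization above relies on blk_queue_nonrot() evaluating to 0 or 1, so the flag doubles as an array index; an equivalent, more explicit spelling (sketch only):

	int idx = blk_queue_nonrot(bfqd->queue); /* 0: rotational, 1: non-rot */

	bfqd->RT_prod	= R_fast[idx] * T_fast[idx];
	bfqd->peak_rate	= R_fast[idx];
	bfqd->device_speed = BFQ_BFQD_FAST;	/* optimistic initial guess */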
@@ -4058,7 +4659,8 @@ index 0000000..f5f71e4
+ return sprintf(page, "%d\n", var);
+}
+
-+static ssize_t bfq_var_store(unsigned long *var, const char *page, size_t count)
++static ssize_t bfq_var_store(unsigned long *var, const char *page,
++ size_t count)
+{
+ unsigned long new_val;
+ int ret = kstrtoul(page, 10, &new_val);
@@ -4069,12 +4671,12 @@ index 0000000..f5f71e4
+ return count;
+}
+
-+static ssize_t bfq_raising_max_time_show(struct elevator_queue *e, char *page)
++static ssize_t bfq_wr_max_time_show(struct elevator_queue *e, char *page)
+{
+ struct bfq_data *bfqd = e->elevator_data;
-+ return sprintf(page, "%d\n", bfqd->bfq_raising_max_time > 0 ?
-+ jiffies_to_msecs(bfqd->bfq_raising_max_time) :
-+ jiffies_to_msecs(bfq_wrais_duration(bfqd)));
++ return sprintf(page, "%d\n", bfqd->bfq_wr_max_time > 0 ?
++ jiffies_to_msecs(bfqd->bfq_wr_max_time) :
++ jiffies_to_msecs(bfq_wr_duration(bfqd)));
+}
+
+static ssize_t bfq_weights_show(struct elevator_queue *e, char *page)
@@ -4091,15 +4693,13 @@ index 0000000..f5f71e4
+ num_char += sprintf(page + num_char, "Active:\n");
+ list_for_each_entry(bfqq, &bfqd->active_list, bfqq_list) {
+ num_char += sprintf(page + num_char,
-+ "pid%d: weight %hu, nr_queued %d %d,"
-+ " dur %d/%u\n",
++ "pid%d: weight %hu, nr_queued %d %d, dur %d/%u\n",
+ bfqq->pid,
+ bfqq->entity.weight,
+ bfqq->queued[0],
+ bfqq->queued[1],
-+ jiffies_to_msecs(jiffies -
-+ bfqq->last_rais_start_finish),
-+ jiffies_to_msecs(bfqq->raising_cur_max_time));
++ jiffies_to_msecs(jiffies - bfqq->last_wr_start_finish),
++ jiffies_to_msecs(bfqq->wr_cur_max_time));
+ }
+
+ num_char += sprintf(page + num_char, "Idle:\n");
@@ -4109,8 +4709,8 @@ index 0000000..f5f71e4
+ bfqq->pid,
+ bfqq->entity.weight,
+ jiffies_to_msecs(jiffies -
-+ bfqq->last_rais_start_finish),
-+ jiffies_to_msecs(bfqq->raising_cur_max_time));
++ bfqq->last_wr_start_finish),
++ jiffies_to_msecs(bfqq->wr_cur_max_time));
+ }
+
+ spin_unlock_irq(bfqd->queue->queue_lock);
@@ -4134,19 +4734,17 @@ index 0000000..f5f71e4
+SHOW_FUNCTION(bfq_back_seek_penalty_show, bfqd->bfq_back_penalty, 0);
+SHOW_FUNCTION(bfq_slice_idle_show, bfqd->bfq_slice_idle, 1);
+SHOW_FUNCTION(bfq_max_budget_show, bfqd->bfq_user_max_budget, 0);
-+SHOW_FUNCTION(bfq_max_budget_async_rq_show, bfqd->bfq_max_budget_async_rq, 0);
++SHOW_FUNCTION(bfq_max_budget_async_rq_show,
++ bfqd->bfq_max_budget_async_rq, 0);
+SHOW_FUNCTION(bfq_timeout_sync_show, bfqd->bfq_timeout[BLK_RW_SYNC], 1);
+SHOW_FUNCTION(bfq_timeout_async_show, bfqd->bfq_timeout[BLK_RW_ASYNC], 1);
+SHOW_FUNCTION(bfq_low_latency_show, bfqd->low_latency, 0);
-+SHOW_FUNCTION(bfq_raising_coeff_show, bfqd->bfq_raising_coeff, 0);
-+SHOW_FUNCTION(bfq_raising_rt_max_time_show, bfqd->bfq_raising_rt_max_time, 1);
-+SHOW_FUNCTION(bfq_raising_min_idle_time_show, bfqd->bfq_raising_min_idle_time,
++SHOW_FUNCTION(bfq_wr_coeff_show, bfqd->bfq_wr_coeff, 0);
++SHOW_FUNCTION(bfq_wr_rt_max_time_show, bfqd->bfq_wr_rt_max_time, 1);
++SHOW_FUNCTION(bfq_wr_min_idle_time_show, bfqd->bfq_wr_min_idle_time, 1);
++SHOW_FUNCTION(bfq_wr_min_inter_arr_async_show, bfqd->bfq_wr_min_inter_arr_async,
+ 1);
-+SHOW_FUNCTION(bfq_raising_min_inter_arr_async_show,
-+ bfqd->bfq_raising_min_inter_arr_async,
-+ 1);
-+SHOW_FUNCTION(bfq_raising_max_softrt_rate_show,
-+ bfqd->bfq_raising_max_softrt_rate, 0);
++SHOW_FUNCTION(bfq_wr_max_softrt_rate_show, bfqd->bfq_wr_max_softrt_rate, 0);
+#undef SHOW_FUNCTION
+
+#define STORE_FUNCTION(__FUNC, __PTR, MIN, MAX, __CONV) \
@@ -4179,18 +4777,16 @@ index 0000000..f5f71e4
+ 1, INT_MAX, 0);
+STORE_FUNCTION(bfq_timeout_async_store, &bfqd->bfq_timeout[BLK_RW_ASYNC], 0,
+ INT_MAX, 1);
-+STORE_FUNCTION(bfq_raising_coeff_store, &bfqd->bfq_raising_coeff, 1,
-+ INT_MAX, 0);
-+STORE_FUNCTION(bfq_raising_max_time_store, &bfqd->bfq_raising_max_time, 0,
-+ INT_MAX, 1);
-+STORE_FUNCTION(bfq_raising_rt_max_time_store, &bfqd->bfq_raising_rt_max_time, 0,
++STORE_FUNCTION(bfq_wr_coeff_store, &bfqd->bfq_wr_coeff, 1, INT_MAX, 0);
++STORE_FUNCTION(bfq_wr_max_time_store, &bfqd->bfq_wr_max_time, 0, INT_MAX, 1);
++STORE_FUNCTION(bfq_wr_rt_max_time_store, &bfqd->bfq_wr_rt_max_time, 0, INT_MAX,
++ 1);
++STORE_FUNCTION(bfq_wr_min_idle_time_store, &bfqd->bfq_wr_min_idle_time, 0,
+ INT_MAX, 1);
-+STORE_FUNCTION(bfq_raising_min_idle_time_store,
-+ &bfqd->bfq_raising_min_idle_time, 0, INT_MAX, 1);
-+STORE_FUNCTION(bfq_raising_min_inter_arr_async_store,
-+ &bfqd->bfq_raising_min_inter_arr_async, 0, INT_MAX, 1);
-+STORE_FUNCTION(bfq_raising_max_softrt_rate_store,
-+ &bfqd->bfq_raising_max_softrt_rate, 0, INT_MAX, 0);
++STORE_FUNCTION(bfq_wr_min_inter_arr_async_store,
++ &bfqd->bfq_wr_min_inter_arr_async, 0, INT_MAX, 1);
++STORE_FUNCTION(bfq_wr_max_softrt_rate_store, &bfqd->bfq_wr_max_softrt_rate, 0,
++ INT_MAX, 0);
+#undef STORE_FUNCTION
+
+/* do nothing for the moment */
@@ -4259,7 +4855,7 @@ index 0000000..f5f71e4
+ if (__data > 1)
+ __data = 1;
+ if (__data == 0 && bfqd->low_latency != 0)
-+ bfq_end_raising(bfqd);
++ bfq_end_wr(bfqd);
+ bfqd->low_latency = __data;
+
+ return ret;
@@ -4280,12 +4876,12 @@ index 0000000..f5f71e4
+ BFQ_ATTR(timeout_sync),
+ BFQ_ATTR(timeout_async),
+ BFQ_ATTR(low_latency),
-+ BFQ_ATTR(raising_coeff),
-+ BFQ_ATTR(raising_max_time),
-+ BFQ_ATTR(raising_rt_max_time),
-+ BFQ_ATTR(raising_min_idle_time),
-+ BFQ_ATTR(raising_min_inter_arr_async),
-+ BFQ_ATTR(raising_max_softrt_rate),
++ BFQ_ATTR(wr_coeff),
++ BFQ_ATTR(wr_max_time),
++ BFQ_ATTR(wr_rt_max_time),
++ BFQ_ATTR(wr_min_idle_time),
++ BFQ_ATTR(wr_min_inter_arr_async),
++ BFQ_ATTR(wr_max_softrt_rate),
+ BFQ_ATTR(weights),
+ __ATTR_NULL
+};
@@ -4332,8 +4928,25 @@ index 0000000..f5f71e4
+ if (bfq_slab_setup())
+ return -ENOMEM;
+
++ /*
++ * Times to load large popular applications for the typical systems
++ * installed on the reference devices (see the comments before the
++ * definitions of the two arrays).
++ */
++ T_slow[0] = msecs_to_jiffies(2600);
++ T_slow[1] = msecs_to_jiffies(1000);
++ T_fast[0] = msecs_to_jiffies(5500);
++ T_fast[1] = msecs_to_jiffies(2000);
++
++ /*
++ * Thresholds that determine the switch between speed classes (see
++ * the comments before the definition of the array).
++ */
++ device_speed_thresh[0] = (R_fast[0] + R_slow[0]) / 2;
++ device_speed_thresh[1] = (R_fast[1] + R_slow[1]) / 2;
++
+ elv_register(&iosched_bfq);
-+ pr_info("BFQ I/O-scheduler version: v7r2");
++ pr_info("BFQ I/O-scheduler version: v7r7");
+
+ return 0;
+}
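For a sense of scale of the thresholds computed in bfq_init() above — with made-up reference rates, as the real R_slow/R_fast definitions sit earlier in the patch, outside this hunk:

	/*
	 * If, say, R_slow[0] = 1000 and R_fast[0] = 8000 sectors/sec for
	 * rotational devices, the SLOW/FAST switch point would land at
	 * the midpoint:
	 *
	 *	device_speed_thresh[0] = (8000 + 1000) / 2 = 4500 sectors/sec
	 */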
@@ -4348,12 +4961,13 @@ index 0000000..f5f71e4
+module_exit(bfq_exit);
+
+MODULE_AUTHOR("Fabio Checconi, Paolo Valente");
++MODULE_LICENSE("GPL");
diff --git a/block/bfq-sched.c b/block/bfq-sched.c
new file mode 100644
-index 0000000..999b475
+index 0000000..2931563
--- /dev/null
+++ b/block/bfq-sched.c
-@@ -0,0 +1,1078 @@
+@@ -0,0 +1,1214 @@
+/*
+ * BFQ: Hierarchical B-WF2Q+ scheduler.
+ *
@@ -4453,7 +5067,8 @@ index 0000000..999b475
+ * Shift for timestamp calculations. This actually limits the maximum
+ * service allowed in one timestamp delta (small shift values increase it),
+ * the maximum total weight that can be used for the queues in the system
-+ * (big shift values increase it), and the period of virtual time wraparounds.
++ * (big shift values increase it), and the period of virtual time
++ * wraparounds.
+ */
+#define WFQ_SERVICE_SHIFT 22
+
@@ -4685,8 +5300,18 @@ index 0000000..999b475
+ goto up;
+}
+
++static void bfq_weights_tree_add(struct bfq_data *bfqd,
++ struct bfq_entity *entity,
++ struct rb_root *root);
++
++static void bfq_weights_tree_remove(struct bfq_data *bfqd,
++ struct bfq_entity *entity,
++ struct rb_root *root);
++
++
+/**
-+ * bfq_active_insert - insert an entity in the active tree of its group/device.
++ * bfq_active_insert - insert an entity in the active tree of its
++ * group/device.
+ * @st: the service tree of the entity.
+ * @entity: the entity being inserted.
+ *
@@ -4700,6 +5325,11 @@ index 0000000..999b475
+{
+ struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
+ struct rb_node *node = &entity->rb_node;
++#ifdef CONFIG_CGROUP_BFQIO
++ struct bfq_sched_data *sd = NULL;
++ struct bfq_group *bfqg = NULL;
++ struct bfq_data *bfqd = NULL;
++#endif
+
+ bfq_insert(&st->active, entity);
+
@@ -4710,17 +5340,36 @@ index 0000000..999b475
+
+ bfq_update_active_tree(node);
+
++#ifdef CONFIG_CGROUP_BFQIO
++ sd = entity->sched_data;
++ bfqg = container_of(sd, struct bfq_group, sched_data);
++ BUG_ON(!bfqg);
++ bfqd = (struct bfq_data *)bfqg->bfqd;
++#endif
+ if (bfqq != NULL)
+ list_add(&bfqq->bfqq_list, &bfqq->bfqd->active_list);
++#ifdef CONFIG_CGROUP_BFQIO
++ else { /* bfq_group */
++ BUG_ON(!bfqd);
++ bfq_weights_tree_add(bfqd, entity, &bfqd->group_weights_tree);
++ }
++ if (bfqg != bfqd->root_group) {
++ BUG_ON(!bfqg);
++ BUG_ON(!bfqd);
++ bfqg->active_entities++;
++ if (bfqg->active_entities == 2)
++ bfqd->active_numerous_groups++;
++ }
++#endif
+}
+
+/**
+ * bfq_ioprio_to_weight - calc a weight from an ioprio.
+ * @ioprio: the ioprio value to convert.
+ */
-+static unsigned short bfq_ioprio_to_weight(int ioprio)
++static inline unsigned short bfq_ioprio_to_weight(int ioprio)
+{
-+ WARN_ON(ioprio < 0 || ioprio >= IOPRIO_BE_NR);
++ BUG_ON(ioprio < 0 || ioprio >= IOPRIO_BE_NR);
+ return IOPRIO_BE_NR - ioprio;
+}
+
@@ -4732,19 +5381,17 @@ index 0000000..999b475
+ * 0 is used as an escape ioprio value for weights (numerically) equal or
+ * larger than IOPRIO_BE_NR
+ */
-+static unsigned short bfq_weight_to_ioprio(int weight)
++static inline unsigned short bfq_weight_to_ioprio(int weight)
+{
-+ WARN_ON(weight < BFQ_MIN_WEIGHT || weight > BFQ_MAX_WEIGHT);
++ BUG_ON(weight < BFQ_MIN_WEIGHT || weight > BFQ_MAX_WEIGHT);
+ return IOPRIO_BE_NR - weight < 0 ? 0 : IOPRIO_BE_NR - weight;
+}
+
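Since IOPRIO_BE_NR is 8, the two helpers above implement a simple inversion; illustratively:

	/*
	 *	ioprio 0 (highest) <-> weight 8
	 *	ioprio 4 (default) <-> weight 4
	 *	ioprio 7 (lowest)  <-> weight 1
	 *
	 * bfq_weight_to_ioprio() additionally clamps weights of 8 or
	 * more to the escape ioprio 0.
	 */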
+static inline void bfq_get_entity(struct bfq_entity *entity)
+{
+ struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
-+ struct bfq_sched_data *sd;
+
+ if (bfqq != NULL) {
-+ sd = entity->sched_data;
+ atomic_inc(&bfqq->ref);
+ bfq_log_bfqq(bfqq->bfqd, bfqq, "get_entity: %p %d",
+ bfqq, atomic_read(&bfqq->ref));
@@ -4791,6 +5438,11 @@ index 0000000..999b475
+{
+ struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
+ struct rb_node *node;
++#ifdef CONFIG_CGROUP_BFQIO
++ struct bfq_sched_data *sd = NULL;
++ struct bfq_group *bfqg = NULL;
++ struct bfq_data *bfqd = NULL;
++#endif
+
+ node = bfq_find_deepest(&entity->rb_node);
+ bfq_extract(&st->active, entity);
@@ -4798,8 +5450,31 @@ index 0000000..999b475
+ if (node != NULL)
+ bfq_update_active_tree(node);
+
++#ifdef CONFIG_CGROUP_BFQIO
++ sd = entity->sched_data;
++ bfqg = container_of(sd, struct bfq_group, sched_data);
++ BUG_ON(!bfqg);
++ bfqd = (struct bfq_data *)bfqg->bfqd;
++#endif
+ if (bfqq != NULL)
+ list_del(&bfqq->bfqq_list);
++#ifdef CONFIG_CGROUP_BFQIO
++ else { /* bfq_group */
++ BUG_ON(!bfqd);
++ bfq_weights_tree_remove(bfqd, entity,
++ &bfqd->group_weights_tree);
++ }
++ if (bfqg != bfqd->root_group) {
++ BUG_ON(!bfqg);
++ BUG_ON(!bfqd);
++ BUG_ON(!bfqg->active_entities);
++ bfqg->active_entities--;
++ if (bfqg->active_entities == 1) {
++ BUG_ON(!bfqd->active_numerous_groups);
++ bfqd->active_numerous_groups--;
++ }
++ }
++#endif
+}
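Taken together, the insert and extract paths above maintain one invariant, sketched here:

	/*
	 * active_numerous_groups == number of non-root bfq_groups whose
	 * active_entities >= 2. It is bumped exactly on a group's 1 -> 2
	 * transition in bfq_active_insert() and dropped on the matching
	 * 2 -> 1 transition here, so symmetric_scenario can test it with
	 * a single comparison.
	 */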
+
+/**
@@ -4897,11 +5572,37 @@ index 0000000..999b475
+
+ if (entity->ioprio_changed) {
+ struct bfq_queue *bfqq = bfq_entity_to_bfqq(entity);
++ unsigned short prev_weight, new_weight;
++ struct bfq_data *bfqd = NULL;
++ struct rb_root *root;
++#ifdef CONFIG_CGROUP_BFQIO
++ struct bfq_sched_data *sd;
++ struct bfq_group *bfqg;
++#endif
++
++ if (bfqq != NULL)
++ bfqd = bfqq->bfqd;
++#ifdef CONFIG_CGROUP_BFQIO
++ else {
++ sd = entity->my_sched_data;
++ bfqg = container_of(sd, struct bfq_group, sched_data);
++ BUG_ON(!bfqg);
++ bfqd = (struct bfq_data *)bfqg->bfqd;
++ BUG_ON(!bfqd);
++ }
++#endif
+
+ BUG_ON(old_st->wsum < entity->weight);
+ old_st->wsum -= entity->weight;
+
+ if (entity->new_weight != entity->orig_weight) {
++ if (entity->new_weight < BFQ_MIN_WEIGHT ||
++ entity->new_weight > BFQ_MAX_WEIGHT) {
++ printk(KERN_CRIT "update_weight_prio: "
++ "new_weight %d\n",
++ entity->new_weight);
++ BUG();
++ }
+ entity->orig_weight = entity->new_weight;
+ entity->ioprio =
+ bfq_weight_to_ioprio(entity->orig_weight);
@@ -4924,8 +5625,31 @@ index 0000000..999b475
+ * when entity->finish <= old_st->vtime).
+ */
+ new_st = bfq_entity_service_tree(entity);
-+ entity->weight = entity->orig_weight *
-+ (bfqq != NULL ? bfqq->raising_coeff : 1);
++
++ prev_weight = entity->weight;
++ new_weight = entity->orig_weight *
++ (bfqq != NULL ? bfqq->wr_coeff : 1);
++ /*
++ * If the weight of the entity changes, remove the entity
++ * from its old weight counter (if there is a counter
++ * associated with the entity), and add it to the counter
++ * associated with its new weight.
++ */
++ if (prev_weight != new_weight) {
++ root = bfqq ? &bfqd->queue_weights_tree :
++ &bfqd->group_weights_tree;
++ bfq_weights_tree_remove(bfqd, entity, root);
++ }
++ entity->weight = new_weight;
++ /*
++ * Add the entity to its weights tree only if it is
++ * not associated with a weight-raised queue.
++ */
++ if (prev_weight != new_weight &&
++ (bfqq ? bfqq->wr_coeff == 1 : 1))
++ /* If we get here, root has been initialized. */
++ bfq_weights_tree_add(bfqd, entity, root);
++
+ new_st->wsum += entity->weight;
+
+ if (new_st != old_st)
@@ -4936,7 +5660,8 @@ index 0000000..999b475
+}
+
+/**
-+ * bfq_bfqq_served - update the scheduler status after selection for service.
++ * bfq_bfqq_served - update the scheduler status after selection for
++ * service.
+ * @bfqq: the queue being served.
+ * @served: bytes to transfer.
+ *
@@ -5075,7 +5800,7 @@ index 0000000..999b475
+ * and if the caller did not specify @requeue, put it on the idle tree.
+ *
+ * Return %1 if the caller should update the entity hierarchy, i.e.,
-+ * if the entity was under service or if it was the next_in_service for
++ * if the entity was in service or if it was the next_in_service for
+ * its sched_data; return %0 otherwise.
+ */
+static int __bfq_deactivate_entity(struct bfq_entity *entity, int requeue)
@@ -5131,7 +5856,7 @@ index 0000000..999b475
+ /*
+ * The parent entity is still backlogged, and
+ * we don't need to update it as it is still
-+ * under service.
++ * in service.
+ */
+ break;
+
@@ -5172,7 +5897,7 @@ index 0000000..999b475
+ * active tree of the device is not empty.
+ *
+ * NOTE: this hierarchical implementation updates vtimes quite often,
-+ * we may end up with reactivated tasks getting timestamps after a
++ * we may end up with reactivated processes getting timestamps after a
+ * vtime skip done because we needed a ->first_active entity on some
+ * intermediate node.
+ */
@@ -5195,8 +5920,8 @@ index 0000000..999b475
+ *
+ * This function searches the first schedulable entity, starting from the
+ * root of the tree and going on the left every time on this side there is
-+ * a subtree with at least one eligible (start >= vtime) entity. The path
-+ * on the right is followed only if a) the left subtree contains no eligible
++ * a subtree with at least one eligible (start >= vtime) entity. The path on
++ * the right is followed only if a) the left subtree contains no eligible
+ * entities and b) no eligible entity has been found yet.
+ */
+static struct bfq_entity *bfq_first_active_entity(struct bfq_service_tree *st)
@@ -5356,8 +6081,8 @@ index 0000000..999b475
+ bfq_update_budget(entity);
+ bfq_update_vtime(bfq_entity_service_tree(entity));
+ bfq_active_extract(bfq_entity_service_tree(entity), entity);
-+ sd->active_entity = entity;
-+ sd->next_active = NULL;
++ sd->in_service_entity = entity;
++ sd->next_in_service = NULL;
+ entity->service = 0;
+ }
+
@@ -5409,8 +6134,22 @@ index 0000000..999b475
+
+ BUG_ON(bfqd->busy_queues == 0);
+ bfqd->busy_queues--;
-+ if (bfqq->raising_coeff > 1)
-+ bfqd->raised_busy_queues--;
++
++ if (!bfqq->dispatched) {
++ bfq_weights_tree_remove(bfqd, &bfqq->entity,
++ &bfqd->queue_weights_tree);
++ if (!blk_queue_nonrot(bfqd->queue)) {
++ BUG_ON(!bfqd->busy_in_flight_queues);
++ bfqd->busy_in_flight_queues--;
++ if (bfq_bfqq_constantly_seeky(bfqq)) {
++ BUG_ON(!bfqd->
++ const_seeky_busy_in_flight_queues);
++ bfqd->const_seeky_busy_in_flight_queues--;
++ }
++ }
++ }
++ if (bfqq->wr_coeff > 1)
++ bfqd->wr_busy_queues--;
+
+ bfq_deactivate_bfqq(bfqd, bfqq, requeue);
+}
@@ -5429,17 +6168,28 @@ index 0000000..999b475
+
+ bfq_mark_bfqq_busy(bfqq);
+ bfqd->busy_queues++;
-+ if (bfqq->raising_coeff > 1)
-+ bfqd->raised_busy_queues++;
++
++ if (!bfqq->dispatched) {
++ if (bfqq->wr_coeff == 1)
++ bfq_weights_tree_add(bfqd, &bfqq->entity,
++ &bfqd->queue_weights_tree);
++ if (!blk_queue_nonrot(bfqd->queue)) {
++ bfqd->busy_in_flight_queues++;
++ if (bfq_bfqq_constantly_seeky(bfqq))
++ bfqd->const_seeky_busy_in_flight_queues++;
++ }
++ }
++ if (bfqq->wr_coeff > 1)
++ bfqd->wr_busy_queues++;
+}
diff --git a/block/bfq.h b/block/bfq.h
new file mode 100644
-index 0000000..3ca8482
+index 0000000..050eeaf
--- /dev/null
+++ b/block/bfq.h
-@@ -0,0 +1,622 @@
+@@ -0,0 +1,775 @@
+/*
-+ * BFQ-v7r2 for 3.14.0: data structures and common functions prototypes.
++ * BFQ-v7r7 for 3.14.0: data structures and common function prototypes.
+ *
+ * Based on ideas and code from CFQ:
+ * Copyright (C) 2003 Jens Axboe <axboe@kernel.dk>
@@ -5464,6 +6214,8 @@ index 0000000..3ca8482
+#define BFQ_MIN_WEIGHT 1
+#define BFQ_MAX_WEIGHT 1000
+
++#define BFQ_DEFAULT_QUEUE_IOPRIO 4
++
+#define BFQ_DEFAULT_GRP_WEIGHT 10
+#define BFQ_DEFAULT_GRP_IOPRIO 0
+#define BFQ_DEFAULT_GRP_CLASS IOPRIO_CLASS_BE
@@ -5497,7 +6249,7 @@ index 0000000..3ca8482
+
+/**
+ * struct bfq_sched_data - multi-class scheduler.
-+ * @in_service_entity: entity under service.
++ * @in_service_entity: entity in service.
+ * @next_in_service: head-of-the-line entity in the scheduler.
+ * @service_tree: array of service trees, one per ioprio_class.
+ *
@@ -5521,8 +6273,23 @@ index 0000000..3ca8482
+};
+
+/**
++ * struct bfq_weight_counter - counter of the number of all active entities
++ * with a given weight.
++ * @weight: weight of the entities that this counter refers to.
++ * @num_active: number of active entities with this weight.
++ * @weights_node: weights tree member (see bfq_data's @queue_weights_tree
++ * and @group_weights_tree).
++ */
++struct bfq_weight_counter {
++ short int weight;
++ unsigned int num_active;
++ struct rb_node weights_node;
++};
++
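How the counter above gets used — a sketch; the actual check lives in bfq_differentiated_weights(), outside this hunk:

	/*
	 * One rb-node per distinct weight, with num_active entities
	 * hanging off it, turns "do all active queues share one weight?"
	 * into "does the tree hold at most one counter?" -- an O(1)
	 * test instead of a walk over every active queue.
	 */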
++/**
+ * struct bfq_entity - schedulable entity.
+ * @rb_node: service_tree member.
++ * @weight_counter: pointer to the weight counter associated with this entity.
+ * @on_st: flag, true if the entity is on a tree (either the active or
+ * the idle one of its service_tree).
+ * @finish: B-WF2Q+ finish timestamp (aka F_i).
@@ -5573,6 +6340,7 @@ index 0000000..3ca8482
+ */
+struct bfq_entity {
+ struct rb_node rb_node;
++ struct bfq_weight_counter *weight_counter;
+
+ int on_st;
+
@@ -5618,28 +6386,34 @@ index 0000000..3ca8482
+ * @max_budget: maximum budget allowed from the feedback mechanism.
+ * @budget_timeout: budget expiration (in jiffies).
+ * @dispatched: number of requests on the dispatch list or inside driver.
-+ * @org_ioprio: saved ioprio during boosted periods.
+ * @flags: status flags.
+ * @bfqq_list: node for active/idle bfqq list inside our bfqd.
++ * @burst_list_node: node for the device's burst list.
+ * @seek_samples: number of seeks sampled
+ * @seek_total: sum of the distances of the seeks sampled
+ * @seek_mean: mean seek distance
+ * @last_request_pos: position of the last request enqueued
++ * @requests_within_timer: number of consecutive pairs of request completion
++ * and arrival, such that the queue becomes idle
++ * after the completion, but the next request arrives
++ * within an idle time slice; used only if the queue's
++ * IO_bound flag has been cleared.
+ * @pid: pid of the process owning the queue, used for logging purposes.
-+ * @last_rais_start_finish: start time of the current weight-raising period if
-+ * the @bfq-queue is being weight-raised, otherwise
-+ * finish time of the last weight-raising period
-+ * @raising_cur_max_time: current max raising time for this queue
-+ * @soft_rt_next_start: minimum time instant such that, only if a new request
-+ * is enqueued after this time instant in an idle
-+ * @bfq_queue with no outstanding requests, then the
-+ * task associated with the queue it is deemed as soft
-+ * real-time (see the comments to the function
-+ * bfq_bfqq_softrt_next_start())
++ * @last_wr_start_finish: start time of the current weight-raising period if
++ * the @bfq-queue is being weight-raised, otherwise
++ * finish time of the last weight-raising period
++ * @wr_cur_max_time: current max raising time for this queue
++ * @soft_rt_next_start: minimum time instant such that, only if a new
++ * request is enqueued after this time instant in an
++ * idle @bfq_queue with no outstanding requests, then
++ * the task associated with the queue is deemed as
++ * soft real-time (see the comments to the function
++ * bfq_bfqq_softrt_next_start()).
+ * @last_idle_bklogged: time of the last transition of the @bfq_queue from
+ * idle to backlogged
+ * @service_from_backlogged: cumulative service received from the @bfq_queue
-+ * since the last transition from idle to backlogged
++ * since the last transition from idle to
++ * backlogged
+ *
+ * A bfq_queue is a leaf request queue; it can be associated with an io_context
+ * or more, if it is async or shared between cooperating processes. @cgroup
@@ -5671,24 +6445,26 @@ index 0000000..3ca8482
+
+ int dispatched;
+
-+ unsigned short org_ioprio;
-+
+ unsigned int flags;
+
+ struct list_head bfqq_list;
+
++ struct hlist_node burst_list_node;
++
+ unsigned int seek_samples;
+ u64 seek_total;
+ sector_t seek_mean;
+ sector_t last_request_pos;
+
++ unsigned int requests_within_timer;
++
+ pid_t pid;
+
+ /* weight-raising fields */
-+ unsigned long raising_cur_max_time;
++ unsigned long wr_cur_max_time;
+ unsigned long soft_rt_next_start;
-+ unsigned long last_rais_start_finish;
-+ unsigned int raising_coeff;
++ unsigned long last_wr_start_finish;
++ unsigned int wr_coeff;
+ unsigned long last_idle_bklogged;
+ unsigned long service_from_backlogged;
+};
@@ -5720,35 +6496,82 @@ index 0000000..3ca8482
+ int ioprio;
+};
+
++enum bfq_device_speed {
++ BFQ_BFQD_FAST,
++ BFQ_BFQD_SLOW,
++};
++
+/**
+ * struct bfq_data - per device data structure.
+ * @queue: request queue for the managed device.
+ * @root_group: root bfq_group for the device.
-+ * @rq_pos_tree: rbtree sorted by next_request position,
-+ * used when determining if two or more queues
-+ * have interleaving requests (see bfq_close_cooperator).
++ * @rq_pos_tree: rbtree sorted by next_request position, used when
++ * determining if two or more queues have interleaving
++ * requests (see bfq_close_cooperator()).
++ * @active_numerous_groups: number of bfq_groups containing more than one
++ * active @bfq_entity.
++ * @queue_weights_tree: rbtree of weight counters of @bfq_queues, sorted by
++ * weight. Used to keep track of whether all @bfq_queues
++ * have the same weight. The tree contains one counter
++ * for each distinct weight associated with some active
++ * and not weight-raised @bfq_queue (see the comments to
++ * the functions bfq_weights_tree_[add|remove] for
++ * further details).
++ * @group_weights_tree: rbtree of non-queue @bfq_entity weight counters, sorted
++ * by weight. Used to keep track of whether all
++ * @bfq_groups have the same weight. The tree contains
++ * one counter for each distinct weight associated with
++ * some active @bfq_group (see the comments to the
++ * functions bfq_weights_tree_[add|remove] for further
++ * details).
+ * @busy_queues: number of bfq_queues containing requests (including the
-+ * queue under service, even if it is idling).
-+ * @raised_busy_queues: number of weight-raised busy bfq_queues.
++ * queue in service, even if it is idling).
++ * @busy_in_flight_queues: number of @bfq_queues containing pending or
++ * in-flight requests, plus the @bfq_queue in
++ * service, even if idle but waiting for the
++ * possible arrival of its next sync request. This
++ * field is updated only if the device is rotational,
++ * but used only if the device is also NCQ-capable.
++ * The reason why the field is updated also for non-
++ * NCQ-capable rotational devices is related to the
++ * fact that the value of @hw_tag may be set also
++ * later than when busy_in_flight_queues may need to
++ * be incremented for the first time(s). Taking also
++ * this possibility into account, to avoid unbalanced
++ * increments/decrements, would imply more overhead
++ * than just updating busy_in_flight_queues
++ * regardless of the value of @hw_tag.
++ * @const_seeky_busy_in_flight_queues: number of constantly-seeky @bfq_queues
++ * (that is, seeky queues that expired
++ * for budget timeout at least once)
++ * containing pending or in-flight
++ * requests, including the in-service
++ * @bfq_queue if constantly seeky. This
++ * field is updated only if the device
++ * is rotational, but used only if the
++ * device is also NCQ-capable (see the
++ * comments to @busy_in_flight_queues).
++ * @wr_busy_queues: number of weight-raised busy @bfq_queues.
+ * @queued: number of queued requests.
+ * @rq_in_driver: number of requests dispatched and waiting for completion.
+ * @sync_flight: number of sync requests in the driver.
-+ * @max_rq_in_driver: max number of reqs in driver in the last @hw_tag_samples
-+ * completed requests .
++ * @max_rq_in_driver: max number of reqs in driver in the last
++ * @hw_tag_samples completed requests.
+ * @hw_tag_samples: nr of samples used to calculate hw_tag.
+ * @hw_tag: flag set to one if the driver is showing a queueing behavior.
+ * @budgets_assigned: number of budgets assigned.
+ * @idle_slice_timer: timer set when idling for the next sequential request
-+ * from the queue under service.
++ * from the queue in service.
+ * @unplug_work: delayed work to restart dispatching on the request queue.
-+ * @in_service_queue: bfq_queue under service.
++ * @in_service_queue: bfq_queue in service.
+ * @in_service_bic: bfq_io_cq (bic) associated with the @in_service_queue.
+ * @last_position: on-disk position of the last served request.
+ * @last_budget_start: beginning of the last budget.
+ * @last_idling_start: beginning of the last idle slice.
+ * @peak_rate: peak transfer rate observed for a budget.
+ * @peak_rate_samples: number of samples used to calculate @peak_rate.
-+ * @bfq_max_budget: maximum budget allotted to a bfq_queue before rescheduling.
++ * @bfq_max_budget: maximum budget allotted to a bfq_queue before
++ * rescheduling.
+ * @group_list: list of all the bfq_groups active on the device.
+ * @active_list: list of all the bfq_queues active on the device.
+ * @idle_list: list of all the bfq_queues idle on the device.
@@ -5758,7 +6581,8 @@ index 0000000..3ca8482
+ * @bfq_back_penalty: weight of backward seeks wrt forward ones.
+ * @bfq_back_max: maximum allowed backward seek.
+ * @bfq_slice_idle: maximum idling time.
-+ * @bfq_user_max_budget: user-configured max budget value (0 for auto-tuning).
++ * @bfq_user_max_budget: user-configured max budget value
++ * (0 for auto-tuning).
+ * @bfq_max_budget_async_rq: maximum budget (in nr of requests) allotted to
+ * async queues.
+ * @bfq_timeout: timeout for bfq_queues to consume their budget; used to
@@ -5768,21 +6592,49 @@ index 0000000..3ca8482
+ * they are charged for the whole allocated budget, to try
+ * to preserve a behavior reasonably fair among them, but
+ * without service-domain guarantees).
-+ * @bfq_raising_coeff: Maximum factor by which the weight of a boosted
-+ * queue is multiplied
-+ * @bfq_raising_max_time: maximum duration of a weight-raising period (jiffies)
-+ * @bfq_raising_rt_max_time: maximum duration for soft real-time processes
-+ * @bfq_raising_min_idle_time: minimum idle period after which weight-raising
-+ * may be reactivated for a queue (in jiffies)
-+ * @bfq_raising_min_inter_arr_async: minimum period between request arrivals
-+ * after which weight-raising may be
-+ * reactivated for an already busy queue
-+ * (in jiffies)
-+ * @bfq_raising_max_softrt_rate: max service-rate for a soft real-time queue,
-+ * sectors per seconds
++ * @bfq_coop_thresh: number of queue merges after which a @bfq_queue is
++ * no more granted any weight-raising.
++ * @bfq_failed_cooperations: number of consecutive failed cooperation
++ * chances after which weight-raising is restored
++ * to a queue subject to more than bfq_coop_thresh
++ * queue merges.
++ * @bfq_requests_within_timer: number of consecutive requests that must be
++ * issued within the idle time slice to set
++ * again idling to a queue which was marked as
++ * non-I/O-bound (see the definition of the
++ * IO_bound flag for further details).
++ * @last_ins_in_burst: last time at which a queue entered the current
++ * burst of queues being activated shortly after
++ * each other; for more details about this and the
++ * following parameters related to a burst of
++ * activations, see the comments to the function
++ * @bfq_handle_burst.
++ * @bfq_burst_interval: reference time interval used to decide whether a
++ * queue has been activated shortly after
++ * @last_ins_in_burst.
++ * @burst_size: number of queues in the current burst of queue activations.
++ * @bfq_large_burst_thresh: maximum burst size above which the current
++ * queue-activation burst is deemed as 'large'.
++ * @large_burst: true if a large queue-activation burst is in progress.
++ * @burst_list: head of the burst list (as for the above fields, more details
++ * in the comments to the function bfq_handle_burst).
++ * @low_latency: if set to true, low-latency heuristics are enabled.
++ * @bfq_wr_coeff: maximum factor by which the weight of a weight-raised
++ * queue is multiplied.
++ * @bfq_wr_max_time: maximum duration of a weight-raising period (jiffies).
++ * @bfq_wr_rt_max_time: maximum duration for soft real-time processes.
++ * @bfq_wr_min_idle_time: minimum idle period after which weight-raising
++ * may be reactivated for a queue (in jiffies).
++ * @bfq_wr_min_inter_arr_async: minimum period between request arrivals
++ * after which weight-raising may be
++ * reactivated for an already busy queue
++ * (in jiffies).
++ * @bfq_wr_max_softrt_rate: max service-rate for a soft real-time queue,
++ * sectors per seconds.
+ * @RT_prod: cached value of the product R*T used for computing the maximum
-+ * duration of the weight raising automatically
-+ * @oom_bfqq: fallback dummy bfqq for extreme OOM conditions
++ * duration of the weight raising automatically.
++ * @device_speed: device-speed class for the low-latency heuristic.
++ * @oom_bfqq: fallback dummy bfqq for extreme OOM conditions.
+ *
+ * All the fields are protected by the @queue lock.
+ */
@@ -5790,11 +6642,19 @@ index 0000000..3ca8482
+ struct request_queue *queue;
+
+ struct bfq_group *root_group;
-+
+ struct rb_root rq_pos_tree;
+
++#ifdef CONFIG_CGROUP_BFQIO
++ int active_numerous_groups;
++#endif
++
++ struct rb_root queue_weights_tree;
++ struct rb_root group_weights_tree;
++
+ int busy_queues;
-+ int raised_busy_queues;
++ int busy_in_flight_queues;
++ int const_seeky_busy_in_flight_queues;
++ int wr_busy_queues;
+ int queued;
+ int rq_in_driver;
+ int sync_flight;
@@ -5834,22 +6694,34 @@ index 0000000..3ca8482
+ unsigned int bfq_max_budget_async_rq;
+ unsigned int bfq_timeout[2];
+
++ unsigned int bfq_coop_thresh;
++ unsigned int bfq_failed_cooperations;
++ unsigned int bfq_requests_within_timer;
++
++ unsigned long last_ins_in_burst;
++ unsigned long bfq_burst_interval;
++ int burst_size;
++ unsigned long bfq_large_burst_thresh;
++ bool large_burst;
++ struct hlist_head burst_list;
++
+ bool low_latency;
+
+ /* parameters of the low_latency heuristics */
-+ unsigned int bfq_raising_coeff;
-+ unsigned int bfq_raising_max_time;
-+ unsigned int bfq_raising_rt_max_time;
-+ unsigned int bfq_raising_min_idle_time;
-+ unsigned long bfq_raising_min_inter_arr_async;
-+ unsigned int bfq_raising_max_softrt_rate;
++ unsigned int bfq_wr_coeff;
++ unsigned int bfq_wr_max_time;
++ unsigned int bfq_wr_rt_max_time;
++ unsigned int bfq_wr_min_idle_time;
++ unsigned long bfq_wr_min_inter_arr_async;
++ unsigned int bfq_wr_max_softrt_rate;
+ u64 RT_prod;
++ enum bfq_device_speed device_speed;
+
+ struct bfq_queue oom_bfqq;
+};
+
+enum bfqq_state_flags {
-+ BFQ_BFQQ_FLAG_busy = 0, /* has requests or is under service */
++ BFQ_BFQQ_FLAG_busy = 0, /* has requests or is in service */
+ BFQ_BFQQ_FLAG_wait_request, /* waiting for a request */
+ BFQ_BFQQ_FLAG_must_alloc, /* must be allowed rq alloc */
+ BFQ_BFQQ_FLAG_fifo_expire, /* FIFO checked in this slice */
@@ -5857,9 +6729,25 @@ index 0000000..3ca8482
+ BFQ_BFQQ_FLAG_prio_changed, /* task priority has changed */
+ BFQ_BFQQ_FLAG_sync, /* synchronous queue */
+ BFQ_BFQQ_FLAG_budget_new, /* no completion with this budget */
++ BFQ_BFQQ_FLAG_IO_bound, /*
++ * bfqq has timed-out at least once
++ * having consumed at most 2/10 of
++ * its budget
++ */
++ BFQ_BFQQ_FLAG_in_large_burst, /*
++ * bfqq activated in a large burst,
++ * see comments to bfq_handle_burst.
++ */
++ BFQ_BFQQ_FLAG_constantly_seeky, /*
++ * bfqq has proved to be slow and
++ * seeky until budget timeout
++ */
++ BFQ_BFQQ_FLAG_softrt_update, /*
++ * may need softrt-next-start
++ * update
++ */
+ BFQ_BFQQ_FLAG_coop, /* bfqq is shared */
+ BFQ_BFQQ_FLAG_split_coop, /* shared bfqq will be splitted */
-+ BFQ_BFQQ_FLAG_softrt_update, /* needs softrt-next-start update */
+};
+
+#define BFQ_BFQQ_FNS(name) \
@@ -5884,6 +6772,9 @@ index 0000000..3ca8482
+BFQ_BFQQ_FNS(prio_changed);
+BFQ_BFQQ_FNS(sync);
+BFQ_BFQQ_FNS(budget_new);
++BFQ_BFQQ_FNS(IO_bound);
++BFQ_BFQQ_FNS(in_large_burst);
++BFQ_BFQQ_FNS(constantly_seeky);
+BFQ_BFQQ_FNS(coop);
+BFQ_BFQQ_FNS(split_coop);
+BFQ_BFQQ_FNS(softrt_update);
@@ -5898,7 +6789,10 @@ index 0000000..3ca8482
+
+/* Expiration reasons. */
+enum bfqq_expiration {
-+ BFQ_BFQQ_TOO_IDLE = 0, /* queue has been idling for too long */
++ BFQ_BFQQ_TOO_IDLE = 0, /*
++ * queue has been idling for
++ * too long
++ */
+ BFQ_BFQQ_BUDGET_TIMEOUT, /* budget took too long to be used */
+ BFQ_BFQQ_BUDGET_EXHAUSTED, /* budget consumed */
+ BFQ_BFQQ_NO_MORE_REQUESTS, /* the queue has no more requests */
@@ -5920,7 +6814,13 @@ index 0000000..3ca8482
+ * except for the idle class that has only one queue.
+ * @async_idle_bfqq: async queue for the idle class (ioprio is ignored).
+ * @my_entity: pointer to @entity, %NULL for the toplevel group; used
-+ * to avoid too many special cases during group creation/migration.
++ * to avoid too many special cases during group creation/
++ * migration.
++ * @active_entities: number of active entities belonging to the group;
++ * unused for the root group. Used to know whether there
++ * are groups with more than one active @bfq_entity
++ * (see the comments to the function
++ * bfq_bfqq_must_not_expire()).
+ *
+ * Each (device, cgroup) pair has its own bfq_group, i.e., for each cgroup
+ * there is a set of bfq_groups, each one collecting the lower-level
@@ -5946,6 +6846,8 @@ index 0000000..3ca8482
+ struct bfq_queue *async_idle_bfqq;
+
+ struct bfq_entity *my_entity;
++
++ int active_entities;
+};
+
+/**
@@ -5992,15 +6894,15 @@ index 0000000..3ca8482
+}
+
+static inline struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic,
-+ int is_sync)
++ bool is_sync)
+{
-+ return bic->bfqq[!!is_sync];
++ return bic->bfqq[is_sync];
+}
+
+static inline void bic_set_bfqq(struct bfq_io_cq *bic,
-+ struct bfq_queue *bfqq, int is_sync)
++ struct bfq_queue *bfqq, bool is_sync)
+{
-+ bic->bfqq[!!is_sync] = bfqq;
++ bic->bfqq[is_sync] = bfqq;
+}
+
+static inline struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic)
@@ -6055,11 +6957,12 @@ index 0000000..3ca8482
+static struct bfq_queue *bfq_get_queue(struct bfq_data *bfqd,
+ struct bfq_group *bfqg, int is_sync,
+ struct bfq_io_cq *bic, gfp_t gfp_mask);
-+static void bfq_end_raising_async_queues(struct bfq_data *bfqd,
-+ struct bfq_group *bfqg);
++static void bfq_end_wr_async_queues(struct bfq_data *bfqd,
++ struct bfq_group *bfqg);
+static void bfq_put_async_queues(struct bfq_data *bfqd, struct bfq_group *bfqg);
+static void bfq_exit_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq);
-+#endif
++
++#endif /* _BFQ_H */
--
-1.9.0
+2.1.3
diff --git a/5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r2-for-3.14.0.patch b/5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r7-for-3.14.0.patch
similarity index 63%
rename from 5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r2-for-3.14.0.patch
rename to 5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r7-for-3.14.0.patch
index 010450b..4ff37ea 100644
--- a/5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r2-for-3.14.0.patch
+++ b/5003_BFQ-3-block-bfq-add-Early-Queue-Merge-EQM-to-BFQ-v7r7-for-3.14.0.patch
@@ -1,15 +1,15 @@
-From 4fbeda28a90d7fccd05d28a89d9fc409b2344e0a Mon Sep 17 00:00:00 2001
+From c1e7cf789b3812aaa71fbef0adce65aac3c28de7 Mon Sep 17 00:00:00 2001
From: Mauro Andreolini <mauro.andreolini@unimore.it>
-Date: Fri, 14 Feb 2014 12:52:49 +0100
-Subject: [PATCH 3/3] block, bfq: add Early Queue Merge (EQM) to BFQ-v7r2 for
+Date: Thu, 18 Dec 2014 22:03:16 +0100
+Subject: [PATCH 3/3] block, bfq: add Early Queue Merge (EQM) to BFQ-v7r7 for
3.14.0
-A set of processes may happen to perform interleaved reads, i.e., requests
+A set of processes may happen to perform interleaved reads, i.e.,requests
whose union would give rise to a sequential read pattern. There are two
typical cases: in the first case, processes read fixed-size chunks of
data at a fixed distance from each other, while in the second case processes
may read variable-size chunks at variable distances. The latter case occurs
-for example with KVM, which splits the I/O generated by the guest into
+for example with QEMU, which splits the I/O generated by the guest into
multiple chunks, and lets these chunks be served by a pool of cooperating
processes, iteratively assigning the next chunk of I/O to the first
available process. CFQ uses actual queue merging for the first type of
@@ -32,21 +32,27 @@ a non-merged state.
Signed-off-by: Mauro Andreolini <mauro.andreolini@unimore.it>
Signed-off-by: Arianna Avanzini <avanzini.arianna@gmail.com>
-Reviewed-by: Paolo Valente <paolo.valente@unimore.it>
+Signed-off-by: Paolo Valente <paolo.valente@unimore.it>
---
- block/bfq-iosched.c | 657 ++++++++++++++++++++++++++++++++++++----------------
- block/bfq-sched.c | 28 ---
- block/bfq.h | 20 +-
- 3 files changed, 476 insertions(+), 229 deletions(-)
+ block/bfq-iosched.c | 751 +++++++++++++++++++++++++++++++++++++---------------
+ block/bfq-sched.c | 28 --
+ block/bfq.h | 54 +++-
+ 3 files changed, 581 insertions(+), 252 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
-index f5f71e4..0d3503d 100644
+index 5aa5f09..c133e23 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
-@@ -445,6 +445,46 @@ static inline unsigned int bfq_wrais_duration(struct bfq_data *bfqd)
+@@ -571,6 +571,57 @@ static inline unsigned int bfq_wr_duration(struct bfq_data *bfqd)
return dur;
}
++static inline unsigned
++bfq_bfqq_cooperations(struct bfq_queue *bfqq)
++{
++ return bfqq->bic ? bfqq->bic->cooperations : 0;
++}
++
+static inline void
+bfq_bfqq_resume_state(struct bfq_queue *bfqq, struct bfq_io_cq *bic)
+{
@@ -54,29 +60,34 @@ index f5f71e4..0d3503d 100644
+ bfq_mark_bfqq_idle_window(bfqq);
+ else
+ bfq_clear_bfqq_idle_window(bfqq);
-+ if (bic->raising_time_left && bfqq->bfqd->low_latency) {
++ if (bic->saved_IO_bound)
++ bfq_mark_bfqq_IO_bound(bfqq);
++ else
++ bfq_clear_bfqq_IO_bound(bfqq);
++ /* Assuming that the flag in_large_burst is already correctly set */
++ if (bic->wr_time_left && bfqq->bfqd->low_latency &&
++ !bfq_bfqq_in_large_burst(bfqq) &&
++ bic->cooperations < bfqq->bfqd->bfq_coop_thresh) {
+ /*
+ * Start a weight raising period with the duration given by
+ * the raising_time_left snapshot.
+ */
+ if (bfq_bfqq_busy(bfqq))
-+ bfqq->bfqd->raised_busy_queues++;
-+ bfqq->raising_coeff = bfqq->bfqd->bfq_raising_coeff;
-+ bfqq->raising_cur_max_time = bic->raising_time_left;
-+ bfqq->last_rais_start_finish = jiffies;
++ bfqq->bfqd->wr_busy_queues++;
++ bfqq->wr_coeff = bfqq->bfqd->bfq_wr_coeff;
++ bfqq->wr_cur_max_time = bic->wr_time_left;
++ bfqq->last_wr_start_finish = jiffies;
+ bfqq->entity.ioprio_changed = 1;
+ }
+ /*
-+ * Clear raising_time_left to prevent bfq_bfqq_save_state() from
++ * Clear wr_time_left to prevent bfq_bfqq_save_state() from
+ * getting confused about the queue's need of a weight-raising
+ * period.
+ */
-+ bic->raising_time_left = 0;
++ bic->wr_time_left = 0;
+}
+
-+/*
-+ * Must be called with the queue_lock held.
-+ */
++/* Must be called with the queue_lock held. */
+static int bfqq_process_refs(struct bfq_queue *bfqq)
+{
+ int process_refs, io_refs;
@@ -87,10 +98,35 @@ index f5f71e4..0d3503d 100644
+ return process_refs;
+}
+
- static void bfq_add_rq_rb(struct request *rq)
- {
- struct bfq_queue *bfqq = RQ_BFQQ(rq);
-@@ -486,12 +526,20 @@ static void bfq_add_rq_rb(struct request *rq)
+ /* Empty burst list and add just bfqq (see comments to bfq_handle_burst) */
+ static inline void bfq_reset_burst_list(struct bfq_data *bfqd,
+ struct bfq_queue *bfqq)
+@@ -815,7 +866,7 @@ static void bfq_add_request(struct request *rq)
+ bfq_rq_pos_tree_add(bfqd, bfqq);
+
+ if (!bfq_bfqq_busy(bfqq)) {
+- bool soft_rt,
++ bool soft_rt, coop_or_in_burst,
+ idle_for_long_time = time_is_before_jiffies(
+ bfqq->budget_timeout +
+ bfqd->bfq_wr_min_idle_time);
+@@ -839,11 +890,12 @@ static void bfq_add_request(struct request *rq)
+ bfqd->last_ins_in_burst = jiffies;
+ }
+
++ coop_or_in_burst = bfq_bfqq_in_large_burst(bfqq) ||
++ bfq_bfqq_cooperations(bfqq) >= bfqd->bfq_coop_thresh;
+ soft_rt = bfqd->bfq_wr_max_softrt_rate > 0 &&
+- !bfq_bfqq_in_large_burst(bfqq) &&
++ !coop_or_in_burst &&
+ time_is_before_jiffies(bfqq->soft_rt_next_start);
+- interactive = !bfq_bfqq_in_large_burst(bfqq) &&
+- idle_for_long_time;
++ interactive = !coop_or_in_burst && idle_for_long_time;
+ entity->budget = max_t(unsigned long, bfqq->max_budget,
+ bfq_serv_to_charge(next_rq, bfqq));
+
+@@ -862,11 +914,20 @@ static void bfq_add_request(struct request *rq)
if (!bfqd->low_latency)
goto add_bfqq_busy;
@@ -108,28 +144,68 @@ index f5f71e4..0d3503d 100644
+ * requests have not been redirected to a shared queue)
+ * start a weight-raising period.
*/
-- if (old_raising_coeff == 1 &&
-- (idle_for_long_time || soft_rt)) {
-+ if (old_raising_coeff == 1 && (idle_for_long_time || soft_rt) &&
+- if (old_wr_coeff == 1 && (interactive || soft_rt)) {
++ if (old_wr_coeff == 1 && (interactive || soft_rt) &&
+ (!bfq_bfqq_sync(bfqq) || bfqq->bic != NULL)) {
- bfqq->raising_coeff = bfqd->bfq_raising_coeff;
- if (idle_for_long_time)
- bfqq->raising_cur_max_time =
-@@ -574,6 +622,7 @@ static void bfq_add_rq_rb(struct request *rq)
- bfqd->bfq_raising_rt_max_time;
+ bfqq->wr_coeff = bfqd->bfq_wr_coeff;
+ if (interactive)
+ bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
+@@ -880,7 +941,7 @@ static void bfq_add_request(struct request *rq)
+ } else if (old_wr_coeff > 1) {
+ if (interactive)
+ bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
+- else if (bfq_bfqq_in_large_burst(bfqq) ||
++ else if (coop_or_in_burst ||
+ (bfqq->wr_cur_max_time ==
+ bfqd->bfq_wr_rt_max_time &&
+ !soft_rt)) {
+@@ -899,18 +960,18 @@ static void bfq_add_request(struct request *rq)
+ /*
+ *
+ * The remaining weight-raising time is lower
+- * than bfqd->bfq_wr_rt_max_time, which
+- * means that the application is enjoying
+- * weight raising either because deemed soft-
+- * rt in the near past, or because deemed
+- * interactive a long ago. In both cases,
+- * resetting now the current remaining weight-
+- * raising time for the application to the
+- * weight-raising duration for soft rt
+- * applications would not cause any latency
+- * increase for the application (as the new
+- * duration would be higher than the remaining
+- * time).
++ * than bfqd->bfq_wr_rt_max_time, which means
++ * that the application is enjoying weight
++ * raising either because deemed soft-rt in
++ * the near past, or because deemed interactive
++ * a long ago.
++ * In both cases, resetting now the current
++ * remaining weight-raising time for the
++ * application to the weight-raising duration
++ * for soft rt applications would not cause any
++ * latency increase for the application (as the
++ * new duration would be higher than the
++ * remaining time).
+ *
+ * In addition, the application is now meeting
+ * the requirements for being deemed soft rt.
+@@ -945,6 +1006,7 @@ static void bfq_add_request(struct request *rq)
+ bfqd->bfq_wr_rt_max_time;
}
}
+set_ioprio_changed:
- if (old_raising_coeff != bfqq->raising_coeff)
+ if (old_wr_coeff != bfqq->wr_coeff)
entity->ioprio_changed = 1;
add_bfqq_busy:
-@@ -756,90 +805,35 @@ static void bfq_end_raising(struct bfq_data *bfqd)
+@@ -1156,90 +1218,35 @@ static void bfq_end_wr(struct bfq_data *bfqd)
spin_unlock_irq(bfqd->queue->queue_lock);
}
-static int bfq_allow_merge(struct request_queue *q, struct request *rq,
- struct bio *bio)
--{
++static inline sector_t bfq_io_struct_pos(void *io_struct, bool request)
+ {
- struct bfq_data *bfqd = q->elevator->elevator_data;
- struct bfq_io_cq *bic;
- struct bfq_queue *bfqq;
@@ -176,8 +252,7 @@ index f5f71e4..0d3503d 100644
- */
-static struct bfq_queue *bfq_set_in_service_queue(struct bfq_data *bfqd,
- struct bfq_queue *bfqq)
-+static inline sector_t bfq_io_struct_pos(void *io_struct, bool request)
- {
+-{
- if (!bfqq)
- bfqq = bfq_get_next_queue(bfqd);
+ if (request)
@@ -228,8 +303,8 @@ index f5f71e4..0d3503d 100644
if (RB_EMPTY_ROOT(root))
return NULL;
-@@ -858,7 +852,7 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
- * position).
+@@ -1258,7 +1265,7 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
+ * next_request position).
*/
__bfqq = rb_entry(parent, struct bfq_queue, pos_node);
- if (bfq_rq_close(bfqd, __bfqq->next_rq))
@@ -237,7 +312,7 @@ index f5f71e4..0d3503d 100644
return __bfqq;
if (blk_rq_pos(__bfqq->next_rq) < sector)
-@@ -869,7 +863,7 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
+@@ -1269,7 +1276,7 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
return NULL;
__bfqq = rb_entry(node, struct bfq_queue, pos_node);
@@ -246,7 +321,7 @@ index f5f71e4..0d3503d 100644
return __bfqq;
return NULL;
-@@ -878,14 +872,12 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
+@@ -1278,14 +1285,12 @@ static struct bfq_queue *bfqq_close(struct bfq_data *bfqd)
/*
* bfqd - obvious
* cur_bfqq - passed in so that we don't decide that the current queue
@@ -265,7 +340,7 @@ index f5f71e4..0d3503d 100644
{
struct bfq_queue *bfqq;
-@@ -905,7 +897,7 @@ static struct bfq_queue *bfq_close_cooperator(struct bfq_data *bfqd,
+@@ -1305,7 +1310,7 @@ static struct bfq_queue *bfq_close_cooperator(struct bfq_data *bfqd,
* working closely on the same area of the disk. In that case,
* we can group them together and don't waste time idling.
*/
@@ -274,7 +349,7 @@ index f5f71e4..0d3503d 100644
if (bfqq == NULL || bfqq == cur_bfqq)
return NULL;
-@@ -932,6 +924,282 @@ static struct bfq_queue *bfq_close_cooperator(struct bfq_data *bfqd,
+@@ -1332,6 +1337,315 @@ static struct bfq_queue *bfq_close_cooperator(struct bfq_data *bfqd,
return bfqq;
}
@@ -313,24 +388,26 @@ index f5f71e4..0d3503d 100644
+ new_bfqq->pid);
+
+ /*
-+ * Merging is just a redirection: the requests of the process owning
-+ * one of the two queues are redirected to the other queue. The latter
-+ * queue, in its turn, is set as shared if this is the first time that
-+ * the requests of some process are redirected to it.
++ * Merging is just a redirection: the requests of the process
++ * owning one of the two queues are redirected to the other queue.
++ * The latter queue, in its turn, is set as shared if this is the
++ * first time that the requests of some process are redirected to
++ * it.
+ *
+ * We redirect bfqq to new_bfqq and not the opposite, because we
-+ * are in the context of the process owning bfqq, hence we have the
-+ * io_cq of this process. So we can immediately configure this io_cq
-+ * to redirect the requests of the process to new_bfqq.
++ * are in the context of the process owning bfqq, hence we have
++ * the io_cq of this process. So we can immediately configure this
++ * io_cq to redirect the requests of the process to new_bfqq.
+ *
+ * NOTE, even if new_bfqq coincides with the in-service queue, the
-+ * io_cq of new_bfqq is not available, because, if the in-service queue
-+ * is shared, bfqd->in_service_bic may not point to the io_cq of the
-+ * in-service queue.
-+ * Redirecting the requests of the process owning bfqq to the currently
-+ * in-service queue is in any case the best option, as we feed the
-+ * in-service queue with new requests close to the last request served
-+ * and, by doing so, hopefully increase the throughput.
++ * io_cq of new_bfqq is not available, because, if the in-service
++ * queue is shared, bfqd->in_service_bic may not point to the
++ * io_cq of the in-service queue.
++ * Redirecting the requests of the process owning bfqq to the
++ * currently in-service queue is in any case the best option, as
++ * we feed the in-service queue with new requests close to the
++ * last request served and, by doing so, hopefully increase the
++ * throughput.
+ */
+ bfqq->new_bfqq = new_bfqq;
+ atomic_add(process_refs, &new_bfqq->ref);
@@ -338,10 +415,17 @@ index f5f71e4..0d3503d 100644
+}
+
+/*
-+ * Attempt to schedule a merge of bfqq with the currently in-service queue or
-+ * with a close queue among the scheduled queues.
++ * Attempt to schedule a merge of bfqq with the currently in-service queue
++ * or with a close queue among the scheduled queues.
+ * Return NULL if no merge was scheduled, a pointer to the shared bfq_queue
+ * structure otherwise.
++ *
++ * The OOM queue is not allowed to participate to cooperation: in fact, since
++ * the requests temporarily redirected to the OOM queue could be redirected
++ * again to dedicated queues at any time, the state needed to correctly
++ * handle merging with the OOM queue would be quite complex and expensive
++ * to maintain. Besides, in such a critical condition as an out of memory,
++ * the benefits of queue merging may be little relevant, or even negligible.
+ */
+static struct bfq_queue *
+bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
@@ -352,13 +436,14 @@ index f5f71e4..0d3503d 100644
+ if (bfqq->new_bfqq)
+ return bfqq->new_bfqq;
+
-+ if (!io_struct)
++ if (!io_struct || unlikely(bfqq == &bfqd->oom_bfqq))
+ return NULL;
+
+ in_service_bfqq = bfqd->in_service_queue;
+
+ if (in_service_bfqq == NULL || in_service_bfqq == bfqq ||
-+ !bfqd->in_service_bic)
++ !bfqd->in_service_bic ||
++ unlikely(in_service_bfqq == &bfqd->oom_bfqq))
+ goto check_scheduled;
+
+ if (bfq_class_idle(in_service_bfqq) || bfq_class_idle(bfqq))
@@ -374,7 +459,7 @@ index f5f71e4..0d3503d 100644
+ bfq_bfqq_sync(in_service_bfqq) && bfq_bfqq_sync(bfqq)) {
+ new_bfqq = bfq_setup_merge(bfqq, in_service_bfqq);
+ if (new_bfqq != NULL)
-+ return new_bfqq; /* Merge with the in-service queue */
++ return new_bfqq; /* Merge with in-service queue */
+ }
+
+ /*
@@ -385,7 +470,7 @@ index f5f71e4..0d3503d 100644
+check_scheduled:
+ new_bfqq = bfq_close_cooperator(bfqd, bfqq,
+ bfq_io_struct_pos(io_struct, request));
-+ if (new_bfqq)
++ if (new_bfqq && likely(new_bfqq != &bfqd->oom_bfqq))
+ return bfq_setup_merge(bfqq, new_bfqq);
+
+ return NULL;
@@ -401,41 +486,46 @@ index f5f71e4..0d3503d 100644
+ */
+ if (bfqq->bic == NULL)
+ return;
-+ if (bfqq->bic->raising_time_left)
++ if (bfqq->bic->wr_time_left)
+ /*
+ * This is the queue of a just-started process, and would
-+ * deserve weight raising: we set raising_time_left to the full
-+ * weight-raising duration to trigger weight-raising when and
-+ * if the queue is split and the first request of the queue
-+ * is enqueued.
++ * deserve weight raising: we set wr_time_left to the full
++ * weight-raising duration to trigger weight-raising when
++ * and if the queue is split and the first request of the
++ * queue is enqueued.
+ */
-+ bfqq->bic->raising_time_left = bfq_wrais_duration(bfqq->bfqd);
-+ else if (bfqq->raising_coeff > 1) {
-+ unsigned long wrais_duration =
-+ jiffies - bfqq->last_rais_start_finish;
++ bfqq->bic->wr_time_left = bfq_wr_duration(bfqq->bfqd);
++ else if (bfqq->wr_coeff > 1) {
++ unsigned long wr_duration =
++ jiffies - bfqq->last_wr_start_finish;
+ /*
+ * It may happen that a queue's weight raising period lasts
-+ * longer than its raising_cur_max_time, as weight raising is
++ * longer than its wr_cur_max_time, as weight raising is
+ * handled only when a request is enqueued or dispatched (it
+ * does not use any timer). If the weight raising period is
+ * about to end, don't save it.
+ */
-+ if (bfqq->raising_cur_max_time <= wrais_duration)
-+ bfqq->bic->raising_time_left = 0;
++ if (bfqq->wr_cur_max_time <= wr_duration)
++ bfqq->bic->wr_time_left = 0;
+ else
-+ bfqq->bic->raising_time_left =
-+ bfqq->raising_cur_max_time - wrais_duration;
++ bfqq->bic->wr_time_left =
++ bfqq->wr_cur_max_time - wr_duration;
+ /*
+ * The bfq_queue is becoming shared or the requests of the
+ * process owning the queue are being redirected to a shared
+ * queue. Stop the weight raising period of the queue, as in
-+ * both cases it should not be owned by an interactive or soft
-+ * real-time application.
++ * both cases it should not be owned by an interactive or
++ * soft real-time application.
+ */
-+ bfq_bfqq_end_raising(bfqq);
++ bfq_bfqq_end_wr(bfqq);
+ } else
-+ bfqq->bic->raising_time_left = 0;
++ bfqq->bic->wr_time_left = 0;
+ bfqq->bic->saved_idle_window = bfq_bfqq_idle_window(bfqq);
++ bfqq->bic->saved_IO_bound = bfq_bfqq_IO_bound(bfqq);
++ bfqq->bic->saved_in_large_burst = bfq_bfqq_in_large_burst(bfqq);
++ bfqq->bic->was_in_burst_list = !hlist_unhashed(&bfqq->burst_list_node);
++ bfqq->bic->cooperations++;
++ bfqq->bic->failed_cooperations = 0;
+}
+
+static inline void
@@ -458,23 +548,28 @@ index f5f71e4..0d3503d 100644
+ /* Save weight raising and idle window of the merged queues */
+ bfq_bfqq_save_state(bfqq);
+ bfq_bfqq_save_state(new_bfqq);
++ if (bfq_bfqq_IO_bound(bfqq))
++ bfq_mark_bfqq_IO_bound(new_bfqq);
++ bfq_clear_bfqq_IO_bound(bfqq);
+ /*
+ * Grab a reference to the bic, to prevent it from being destroyed
+ * before being possibly touched by a bfq_split_bfqq().
+ */
+ bfq_get_bic_reference(bfqq);
+ bfq_get_bic_reference(new_bfqq);
-+ /* Merge queues (that is, let bic redirect its requests to new_bfqq) */
++ /*
++ * Merge queues (that is, let bic redirect its requests to new_bfqq)
++ */
+ bic_set_bfqq(bic, new_bfqq, 1);
+ bfq_mark_bfqq_coop(new_bfqq);
+ /*
-+ * new_bfqq now belongs to at least two bics (it is a shared queue): set
-+ * new_bfqq->bic to NULL. bfqq either:
++ * new_bfqq now belongs to at least two bics (it is a shared queue):
++ * set new_bfqq->bic to NULL. bfqq either:
+ * - does not belong to any bic any more, and hence bfqq->bic must
+ * be set to NULL, or
+ * - is a queue whose owning bics have already been redirected to a
-+ * different queue, hence the queue is destined to not belong to any
-+ * bic soon and bfqq->bic is already NULL (therefore the next
++ * different queue, hence the queue is destined to not belong to
++ * any bic soon and bfqq->bic is already NULL (therefore the next
+ * assignment causes no harm).
+ */
+ new_bfqq->bic = NULL;
@@ -482,6 +577,18 @@ index f5f71e4..0d3503d 100644
+ bfq_put_queue(bfqq);
+}
+
++static inline void bfq_bfqq_increase_failed_cooperations(struct bfq_queue *bfqq)
++{
++ struct bfq_io_cq *bic = bfqq->bic;
++ struct bfq_data *bfqd = bfqq->bfqd;
++
++ if (bic && bfq_bfqq_cooperations(bfqq) >= bfqd->bfq_coop_thresh) {
++ bic->failed_cooperations++;
++ if (bic->failed_cooperations >= bfqd->bfq_failed_cooperations)
++ bic->cooperations = 0;
++ }
++}
++
+static int bfq_allow_merge(struct request_queue *q, struct request *rq,
+ struct bio *bio)
+{
@@ -514,12 +621,13 @@ index f5f71e4..0d3503d 100644
+ if (new_bfqq != NULL) {
+ bfq_merge_bfqqs(bfqd, bic, bfqq, new_bfqq);
+ /*
-+ * If we get here, the bio will be queued in the shared
-+ * queue, i.e., new_bfqq, so use new_bfqq to decide
-+ * whether bio and rq can be merged.
++ * If we get here, the bio will be queued in the
++ * shared queue, i.e., new_bfqq, so use new_bfqq
++ * to decide whether bio and rq can be merged.
+ */
+ bfqq = new_bfqq;
-+ }
++ } else
++ bfq_bfqq_increase_failed_cooperations(bfqq);
+ }
+
+ return bfqq == RQ_BFQQ(rq);
@@ -557,13 +665,11 @@ index f5f71e4..0d3503d 100644
/*
* If enough samples have been computed, return the current max budget
* stored in bfqd, which is dynamically updated according to the
-@@ -1079,63 +1347,6 @@ static struct request *bfq_check_fifo(struct bfq_queue *bfqq)
+@@ -1475,61 +1789,6 @@ static struct request *bfq_check_fifo(struct bfq_queue *bfqq)
return rq;
}
--/*
-- * Must be called with the queue_lock held.
-- */
+-/* Must be called with the queue_lock held. */
-static int bfqq_process_refs(struct bfq_queue *bfqq)
-{
- int process_refs, io_refs;
@@ -621,7 +727,7 @@ index f5f71e4..0d3503d 100644
static inline unsigned long bfq_bfqq_budget_left(struct bfq_queue *bfqq)
{
struct bfq_entity *entity = &bfqq->entity;
-@@ -1729,7 +1940,7 @@ static inline bool bfq_bfqq_must_idle(struct bfq_queue *bfqq)
+@@ -2263,7 +2522,7 @@ static inline bool bfq_bfqq_must_idle(struct bfq_queue *bfqq)
*/
static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
{
@@ -630,7 +736,7 @@ index f5f71e4..0d3503d 100644
struct request *next_rq;
enum bfqq_expiration reason = BFQ_BFQQ_BUDGET_TIMEOUT;
-@@ -1739,17 +1950,6 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
+@@ -2273,17 +2532,6 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
bfq_log_bfqq(bfqd, bfqq, "select_queue: already in-service queue");
@@ -648,7 +754,7 @@ index f5f71e4..0d3503d 100644
if (bfq_may_expire_for_budg_timeout(bfqq) &&
!timer_pending(&bfqd->idle_slice_timer) &&
!bfq_bfqq_must_idle(bfqq))
-@@ -1786,36 +1986,26 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
+@@ -2322,10 +2570,7 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
bfq_clear_bfqq_wait_request(bfqq);
del_timer(&bfqd->idle_slice_timer);
}
@@ -660,13 +766,9 @@ index f5f71e4..0d3503d 100644
}
}
- /*
-- * No requests pending. If the in-service queue has no cooperator and
-- * still has requests in flight (possibly waiting for a completion)
-- * or is idling for a new request, then keep it.
-+ * No requests pending. If the in-service queue still has requests in
-+ * flight (possibly waiting for a completion) or is idling for a new
-+ * request, then keep it.
+@@ -2334,40 +2579,30 @@ static struct bfq_queue *bfq_select_queue(struct bfq_data *bfqd)
+ * in flight (possibly waiting for a completion) or is idling for a
+ * new request, then keep it.
*/
- if (new_bfqq == NULL && (timer_pending(&bfqd->idle_slice_timer) ||
- (bfqq->dispatched != 0 && bfq_bfqq_must_not_expire(bfqq)))) {
@@ -692,44 +794,62 @@ index f5f71e4..0d3503d 100644
bfq_log(bfqd, "select_queue: new queue %d returned",
bfqq != NULL ? bfqq->pid : 0);
keep_queue:
-@@ -1825,9 +2015,8 @@ keep_queue:
- static void bfq_update_raising_data(struct bfq_data *bfqd,
- struct bfq_queue *bfqq)
+ return bfqq;
+ }
+
+-static void bfq_update_wr_data(struct bfq_data *bfqd,
+- struct bfq_queue *bfqq)
++static void bfq_update_wr_data(struct bfq_data *bfqd, struct bfq_queue *bfqq)
{
-+ struct bfq_entity *entity = &bfqq->entity;
- if (bfqq->raising_coeff > 1) { /* queue is being boosted */
+- if (bfqq->wr_coeff > 1) { /* queue is being boosted */
- struct bfq_entity *entity = &bfqq->entity;
-
++ struct bfq_entity *entity = &bfqq->entity;
++ if (bfqq->wr_coeff > 1) { /* queue is being weight-raised */
bfq_log_bfqq(bfqd, bfqq,
- "raising period dur %u/%u msec, "
- "old raising coeff %u, w %d(%d)",
-@@ -1844,7 +2033,7 @@ static void bfq_update_raising_data(struct bfq_data *bfqd,
- "WARN: pending prio change");
+ "raising period dur %u/%u msec, old coeff %u, w %d(%d)",
+- jiffies_to_msecs(jiffies -
+- bfqq->last_wr_start_finish),
++ jiffies_to_msecs(jiffies - bfqq->last_wr_start_finish),
+ jiffies_to_msecs(bfqq->wr_cur_max_time),
+ bfqq->wr_coeff,
+ bfqq->entity.weight, bfqq->entity.orig_weight);
+@@ -2376,12 +2611,16 @@ static void bfq_update_wr_data(struct bfq_data *bfqd,
+ entity->orig_weight * bfqq->wr_coeff);
+ if (entity->ioprio_changed)
+ bfq_log_bfqq(bfqd, bfqq, "WARN: pending prio change");
++
/*
- * If too much time has elapsed from the beginning
-- * of this weight-raising, stop it.
-+ * of this weight-raising period, stop it.
+ * If the queue was activated in a burst, or
+ * too much time has elapsed from the beginning
+- * of this weight-raising, then end weight raising.
++ * of this weight-raising period, or the queue has
++ * exceeded the acceptable number of cooperations,
++ * then end weight raising.
*/
- if (time_is_before_jiffies(bfqq->last_rais_start_finish +
- bfqq->raising_cur_max_time)) {
-@@ -1856,11 +2045,13 @@ static void bfq_update_raising_data(struct bfq_data *bfqd,
- jiffies_to_msecs(bfqq->
- raising_cur_max_time));
- bfq_bfqq_end_raising(bfqq);
+ if (bfq_bfqq_in_large_burst(bfqq) ||
++ bfq_bfqq_cooperations(bfqq) >= bfqd->bfq_coop_thresh ||
+ time_is_before_jiffies(bfqq->last_wr_start_finish +
+ bfqq->wr_cur_max_time)) {
+ bfqq->last_wr_start_finish = jiffies;
+@@ -2390,11 +2629,13 @@ static void bfq_update_wr_data(struct bfq_data *bfqd,
+ bfqq->last_wr_start_finish,
+ jiffies_to_msecs(bfqq->wr_cur_max_time));
+ bfq_bfqq_end_wr(bfqq);
- __bfq_entity_update_weight_prio(
- bfq_entity_service_tree(entity),
- entity);
}
}
+ /* Update weight both if it must be raised and if it must be lowered */
-+ if ((entity->weight > entity->orig_weight) != (bfqq->raising_coeff > 1))
++ if ((entity->weight > entity->orig_weight) != (bfqq->wr_coeff > 1))
+ __bfq_entity_update_weight_prio(
+ bfq_entity_service_tree(entity),
+ entity);
}
/*
-@@ -2101,6 +2292,25 @@ static void bfq_init_icq(struct io_cq *icq)
+@@ -2642,6 +2883,25 @@ static inline void bfq_init_icq(struct io_cq *icq)
struct bfq_io_cq *bic = icq_to_bic(icq);
bic->ttime.last_end_request = jiffies;
@@ -751,11 +871,11 @@ index f5f71e4..0d3503d 100644
+ * the field raising_time_left and assign 1 to it, to mark the queue
+ * as needing weight raising.
+ */
-+ bic->raising_time_left = 1;
++ bic->wr_time_left = 1;
}
static void bfq_exit_icq(struct io_cq *icq)
-@@ -2114,6 +2324,13 @@ static void bfq_exit_icq(struct io_cq *icq)
+@@ -2655,6 +2915,13 @@ static void bfq_exit_icq(struct io_cq *icq)
}
if (bic->bfqq[BLK_RW_SYNC]) {
@@ -769,7 +889,7 @@ index f5f71e4..0d3503d 100644
bfq_exit_bfqq(bfqd, bic->bfqq[BLK_RW_SYNC]);
bic->bfqq[BLK_RW_SYNC] = NULL;
}
-@@ -2405,6 +2622,10 @@ static void bfq_update_idle_window(struct bfq_data *bfqd,
+@@ -2950,6 +3217,10 @@ static void bfq_update_idle_window(struct bfq_data *bfqd,
if (!bfq_bfqq_sync(bfqq) || bfq_class_idle(bfqq))
return;
@@ -780,7 +900,7 @@ index f5f71e4..0d3503d 100644
enable_idle = bfq_bfqq_idle_window(bfqq);
if (atomic_read(&bic->icq.ioc->active_ref) == 0 ||
-@@ -2445,6 +2666,7 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+@@ -2997,6 +3268,7 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
if (bfqq->entity.service > bfq_max_budget(bfqd) / 8 ||
!BFQQ_SEEKY(bfqq))
bfq_update_idle_window(bfqd, bfqq, bic);
@@ -788,7 +908,7 @@ index f5f71e4..0d3503d 100644
bfq_log_bfqq(bfqd, bfqq,
"rq_enqueued: idle_window=%d (seeky %d, mean %llu)",
-@@ -2505,13 +2727,48 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
+@@ -3057,13 +3329,49 @@ static void bfq_rq_enqueued(struct bfq_data *bfqd, struct bfq_queue *bfqq,
static void bfq_insert_request(struct request_queue *q, struct request *rq)
{
struct bfq_data *bfqd = q->elevator->elevator_data;
@@ -820,12 +940,13 @@ index f5f71e4..0d3503d 100644
+ bfqq, new_bfqq);
+ rq->elv.priv[1] = new_bfqq;
+ bfqq = new_bfqq;
-+ }
++ } else
++ bfq_bfqq_increase_failed_cooperations(bfqq);
+ }
+
bfq_init_prio_data(bfqq, RQ_BIC(rq));
- bfq_add_rq_rb(rq);
+ bfq_add_request(rq);
+ /*
+ * Here a newly-created bfq_queue has already started a weight-raising
@@ -834,11 +955,11 @@ index f5f71e4..0d3503d 100644
+ * comments about this field in bfq_init_icq().
+ */
+ if (bfqq->bic != NULL)
-+ bfqq->bic->raising_time_left = 0;
++ bfqq->bic->wr_time_left = 0;
rq_set_fifo_time(rq, jiffies + bfqd->bfq_fifo_expire[rq_is_sync(rq)]);
list_add_tail(&rq->queuelist, &bfqq->fifo);
-@@ -2663,18 +2920,6 @@ static void bfq_put_request(struct request *rq)
+@@ -3228,18 +3536,6 @@ static void bfq_put_request(struct request *rq)
}
}
@@ -857,7 +978,7 @@ index f5f71e4..0d3503d 100644
/*
* Returns NULL if a new bfqq should be allocated, or the old bfqq if this
* was the last process referring to said bfqq.
-@@ -2683,6 +2928,9 @@ static struct bfq_queue *
+@@ -3248,6 +3544,9 @@ static struct bfq_queue *
bfq_split_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq)
{
bfq_log_bfqq(bfqq->bfqd, bfqq, "splitting queue");
@@ -867,7 +988,7 @@ index f5f71e4..0d3503d 100644
if (bfqq_process_refs(bfqq) == 1) {
bfqq->pid = current->pid;
bfq_clear_bfqq_coop(bfqq);
-@@ -2711,6 +2959,7 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
+@@ -3276,6 +3575,7 @@ static int bfq_set_request(struct request_queue *q, struct request *rq,
struct bfq_queue *bfqq;
struct bfq_group *bfqg;
unsigned long flags;
@@ -875,9 +996,21 @@ index f5f71e4..0d3503d 100644
might_sleep_if(gfp_mask & __GFP_WAIT);
-@@ -2729,24 +2978,14 @@ new_queue:
+@@ -3293,25 +3593,26 @@ new_queue:
+ if (bfqq == NULL || bfqq == &bfqd->oom_bfqq) {
bfqq = bfq_get_queue(bfqd, bfqg, is_sync, bic, gfp_mask);
bic_set_bfqq(bic, bfqq, is_sync);
++ if (split && is_sync) {
++ if ((bic->was_in_burst_list && bfqd->large_burst) ||
++ bic->saved_in_large_burst)
++ bfq_mark_bfqq_in_large_burst(bfqq);
++ else {
++ bfq_clear_bfqq_in_large_burst(bfqq);
++ if (bic->was_in_burst_list)
++ hlist_add_head(&bfqq->burst_list_node,
++ &bfqd->burst_list);
++ }
++ }
} else {
- /*
- * If the queue was seeky for too long, break it apart.
@@ -902,7 +1035,7 @@ index f5f71e4..0d3503d 100644
}
bfqq->allocated[rw]++;
-@@ -2757,6 +2996,26 @@ new_queue:
+@@ -3322,6 +3623,26 @@ new_queue:
rq->elv.priv[0] = bic;
rq->elv.priv[1] = bfqq;
@@ -913,14 +1046,14 @@ index f5f71e4..0d3503d 100644
+ * queue has just been split, mark a flag so that the
+ * information is available to the other scheduler hooks.
+ */
-+ if (bfqq_process_refs(bfqq) == 1) {
++ if (likely(bfqq != &bfqd->oom_bfqq) && bfqq_process_refs(bfqq) == 1) {
+ bfqq->bic = bic;
+ if (split) {
+ bfq_mark_bfqq_just_split(bfqq);
+ /*
-+ * If the queue has just been split from a shared queue,
-+ * restore the idle window and the possible weight
-+ * raising period.
++ * If the queue has just been split from a shared
++ * queue, restore the idle window and the possible
++ * weight raising period.
+ */
+ bfq_bfqq_resume_state(bfqq, bic);
+ }
@@ -930,10 +1063,10 @@ index f5f71e4..0d3503d 100644
return 0;
diff --git a/block/bfq-sched.c b/block/bfq-sched.c
-index 999b475..e54ea33 100644
+index 2931563..6764a7e 100644
--- a/block/bfq-sched.c
+++ b/block/bfq-sched.c
-@@ -980,34 +980,6 @@ static struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd)
+@@ -1091,34 +1091,6 @@ static struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd)
return bfqq;
}
@@ -957,8 +1090,8 @@ index 999b475..e54ea33 100644
- bfq_update_budget(entity);
- bfq_update_vtime(bfq_entity_service_tree(entity));
- bfq_active_extract(bfq_entity_service_tree(entity), entity);
-- sd->active_entity = entity;
-- sd->next_active = NULL;
+- sd->in_service_entity = entity;
+- sd->next_in_service = NULL;
- entity->service = 0;
- }
-
@@ -969,38 +1102,69 @@ index 999b475..e54ea33 100644
{
if (bfqd->in_service_bic != NULL) {
diff --git a/block/bfq.h b/block/bfq.h
-index 3ca8482..c278796 100644
+index 050eeaf..a2022d0 100644
--- a/block/bfq.h
+++ b/block/bfq.h
-@@ -200,6 +200,8 @@ struct bfq_group;
+@@ -218,18 +218,21 @@ struct bfq_group;
+ * idle @bfq_queue with no outstanding requests, then
+ * the task associated with the queue it is deemed as
+ * soft real-time (see the comments to the function
+- * bfq_bfqq_softrt_next_start()).
++ * bfq_bfqq_softrt_next_start())
+ * @last_idle_bklogged: time of the last transition of the @bfq_queue from
* idle to backlogged
* @service_from_backlogged: cumulative service received from the @bfq_queue
- * since the last transition from idle to backlogged
+ * since the last transition from idle to
+ * backlogged
+ * @bic: pointer to the bfq_io_cq owning the bfq_queue, set to %NULL if the
+ * queue is shared
*
- * A bfq_queue is a leaf request queue; it can be associated with an io_context
- * or more, if it is async or shared between cooperating processes. @cgroup
-@@ -243,6 +245,7 @@ struct bfq_queue {
- sector_t last_request_pos;
+- * A bfq_queue is a leaf request queue; it can be associated with an io_context
+- * or more, if it is async or shared between cooperating processes. @cgroup
+- * holds a reference to the cgroup, to be sure that it does not disappear while
+- * a bfqq still references it (mostly to avoid races between request issuing and
+- * task migration followed by cgroup destruction).
++ * A bfq_queue is a leaf request queue; it can be associated with an
++ * io_context or more, if it is async or shared between cooperating
++ * processes. @cgroup holds a reference to the cgroup, to be sure that it
++ * does not disappear while a bfqq still references it (mostly to avoid
++ * races between request issuing and task migration followed by cgroup
++ * destruction).
+ * All the fields are protected by the queue lock of the containing bfqd.
+ */
+ struct bfq_queue {
+@@ -269,6 +272,7 @@ struct bfq_queue {
+ unsigned int requests_within_timer;
pid_t pid;
+ struct bfq_io_cq *bic;
/* weight-raising fields */
- unsigned long raising_cur_max_time;
-@@ -272,12 +275,23 @@ struct bfq_ttime {
+ unsigned long wr_cur_max_time;
+@@ -298,12 +302,42 @@ struct bfq_ttime {
* @icq: associated io_cq structure
* @bfqq: array of two process queues, the sync and the async
* @ttime: associated @bfq_ttime struct
-+ * @raising_time_left: snapshot of the time left before weight raising ends
-+ * for the sync queue associated to this process; this
-+ * snapshot is taken to remember this value while the weight
-+ * raising is suspended because the queue is merged with a
-+ * shared queue, and is used to set @raising_cur_max_time
-+ * when the queue is split from the shared queue and its
-+ * weight is raised again
-+ * @saved_idle_window: same purpose as the previous field for the idle window
++ * @wr_time_left: snapshot of the time left before weight raising ends
++ * for the sync queue associated to this process; this
++ * snapshot is taken to remember this value while the weight
++ * raising is suspended because the queue is merged with a
++ * shared queue, and is used to set @raising_cur_max_time
++ * when the queue is split from the shared queue and its
++ * weight is raised again
++ * @saved_idle_window: same purpose as the previous field for the idle
++ * window
++ * @saved_IO_bound: same purpose as the previous two fields for the I/O
++ * bound classification of a queue
++ * @saved_in_large_burst: same purpose as the previous fields for the
++ * value of the field keeping the queue's belonging
++ * to a large burst
++ * @was_in_burst_list: true if the queue belonged to a burst list
++ * before its merge with another cooperating queue
++ * @cooperations: counter of consecutive successful queue merges underwent
++ * by any of the process' @bfq_queues
++ * @failed_cooperations: counter of consecutive failed queue merges of any
++ * of the process' @bfq_queues
*/
struct bfq_io_cq {
struct io_cq icq; /* must be the first member */
@@ -1008,25 +1172,45 @@ index 3ca8482..c278796 100644
struct bfq_ttime ttime;
int ioprio;
+
-+ unsigned int raising_time_left;
-+ unsigned int saved_idle_window;
++ unsigned int wr_time_left;
++ bool saved_idle_window;
++ bool saved_IO_bound;
++
++ bool saved_in_large_burst;
++ bool was_in_burst_list;
++
++ unsigned int cooperations;
++ unsigned int failed_cooperations;
};
- /**
-@@ -418,8 +432,9 @@ enum bfqq_state_flags {
+ enum bfq_device_speed {
+@@ -539,7 +573,7 @@ enum bfqq_state_flags {
+ BFQ_BFQQ_FLAG_prio_changed, /* task priority has changed */
BFQ_BFQQ_FLAG_sync, /* synchronous queue */
BFQ_BFQQ_FLAG_budget_new, /* no completion with this budget */
+- BFQ_BFQQ_FLAG_IO_bound, /*
++ BFQ_BFQQ_FLAG_IO_bound, /*
+ * bfqq has timed-out at least once
+ * having consumed at most 2/10 of
+ * its budget
+@@ -552,12 +586,13 @@ enum bfqq_state_flags {
+ * bfqq has proved to be slow and
+ * seeky until budget timeout
+ */
+- BFQ_BFQQ_FLAG_softrt_update, /*
++ BFQ_BFQQ_FLAG_softrt_update, /*
+ * may need softrt-next-start
+ * update
+ */
BFQ_BFQQ_FLAG_coop, /* bfqq is shared */
- BFQ_BFQQ_FLAG_split_coop, /* shared bfqq will be splitted */
-- BFQ_BFQQ_FLAG_softrt_update, /* needs softrt-next-start update */
+ BFQ_BFQQ_FLAG_split_coop, /* shared bfqq will be split */
+ BFQ_BFQQ_FLAG_just_split, /* queue has just been split */
-+ BFQ_BFQQ_FLAG_softrt_update, /* may need softrt-next-start update */
};
#define BFQ_BFQQ_FNS(name) \
-@@ -446,6 +461,7 @@ BFQ_BFQQ_FNS(sync);
- BFQ_BFQQ_FNS(budget_new);
+@@ -587,6 +622,7 @@ BFQ_BFQQ_FNS(in_large_burst);
+ BFQ_BFQQ_FNS(constantly_seeky);
BFQ_BFQQ_FNS(coop);
BFQ_BFQQ_FNS(split_coop);
+BFQ_BFQQ_FNS(just_split);
@@ -1034,5 +1218,5 @@ index 3ca8482..c278796 100644
#undef BFQ_BFQQ_FNS
--
-1.9.0
+2.1.3
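
A minimal sketch for exercising the upgraded scheduler once the three
patches above are applied, assuming a 3.14 tree built with the options
that patch 1 adds (CONFIG_IOSCHED_BFQ, plus CONFIG_CGROUP_BFQIO for the
cgroup bits) and using sda as a stand-in for the target device:

  # confirm the BFQ options made it into the kernel config
  grep -E 'CONFIG_IOSCHED_BFQ|CONFIG_CGROUP_BFQIO' /usr/src/linux/.config

  # list the available elevators; the active one is shown in brackets
  cat /sys/block/sda/queue/scheduler

  # switch sda to BFQ at runtime
  echo bfq > /sys/block/sda/queue/scheduler

BFQ can also be made the boot-time default, either with elevator=bfq on
the kernel command line or by selecting it as the default I/O scheduler
in the patched Kconfig.iosched; the exact Kconfig choice name depends on
the bits patch 1 introduces, so check the patched tree rather than
relying on the names assumed here.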