Next: Extent Fragments, Previous: Zero-Length Extents, Up: Extents [Contents][Index]

The extents in a buffer are ordered by “display order” because that is that order that the redisplay mechanism needs to process them in. The e-order is an auxiliary ordering used to facilitate operations over extents. The operations that can be performed on the ordered list of extents in a buffer are

- Locate where an extent would go if inserted into the list.
- Insert an extent into the list.
- Remove an extent from the list.
- Map over all the extents that overlap a range.

(4) requires being able to determine the first and last extents that overlap a range.

NOTE: *overlap* is used as follows:

- two ranges overlap if they have at least one point in common. Whether the endpoints are open or closed makes a difference here.
- a point overlaps a range if the point is contained within the
range; this is equivalent to treating a point
*P*as the range*[P, P]*. - In the case of an
*extent*overlapping a point or range, the extent is normally treated as having closed endpoints. This applies consistently in the discussion of stacks of extents and such below. Note that this definition of overlap is not necessarily consistent with the extents that`map-extents`

maps over, since`map-extents`

sometimes pays attention to whether the endpoints of an extents are open or closed. But for our purposes, it greatly simplifies things to treat all extents as having closed endpoints.

First, define *>*, *<*, *<=*, etc. as applied to extents
to mean comparison according to the display order. Comparison between
an extent *E* and an index *I* means comparison between
*E* and the range *[I, I]*.

Also define *e>*, *e<*, *e<=*, etc. to mean comparison
according to the e-order.

For any range *R*, define *R(0)* to be the starting index of
the range and *R(1)* to be the ending index of the range.

For any extent *E*, define *E(next)* to be the extent directly
following *E*, and *E(prev)* to be the extent directly
preceding *E*. Assume *E(next)* and *E(prev)* can be
determined from *E* in constant time. (This is because we store
the extent list as a doubly linked list.)

Similarly, define *E(e-next)* and *E(e-prev)* to be the
extents directly following and preceding *E* in the e-order.

Now:

Let *R* be a range.
Let *F* be the first extent overlapping *R*.
Let *L* be the last extent overlapping *R*.

Theorem 1: *R(1)* lies between *L* and *L(next)*,
i.e. *L <= R(1) < L(next)*.

This follows easily from the definition of display order. The basic reason that this theorem applies is that the display order sorts by increasing starting index.

Therefore, we can determine *L* just by looking at where we would
insert *R(1)* into the list, and if we know *F* and are moving
forward over extents, we can easily determine when we’ve hit *L* by
comparing the extent we’re at to *R(1)*.

Theorem 2:F(e-prev) e< [1, R(0)] e<= F.

This is the analog of Theorem 1, and applies because the e-order sorts by increasing ending index.

Therefore, *F* can be found in the same amount of time as
operation (1), i.e. the time that it takes to locate where an extent
would go if inserted into the e-order list.

If the lists were stored as balanced binary trees, then operation (1) would take logarithmic time, which is usually quite fast. However, currently they’re stored as simple doubly-linked lists, and instead we do some caching to try to speed things up.

Define a *stack of extents* (or *SOE*) as the set of extents
(ordered in the display order) that overlap an index *I*, together
with the SOE’s *previous* extent, which is an extent that precedes
*I* in the e-order. (Hopefully there will not be very many extents
between *I* and the previous extent.)

Now:

Let *I* be an index, let *S* be the stack of extents on
*I*, let *F* be the first extent in *S*, and let *P*
be *S*’s previous extent.

Theorem 3: The first extent in *S* is the first extent that overlaps
any range *[I, J]*.

Proof: Any extent that overlaps *[I, J]* but does not include
*I* must have a start index *> I*, and thus be greater than
any extent in *S*.

Therefore, finding the first extent that overlaps a range *R* is
the same as finding the first extent that overlaps *R(0)*.

Theorem 4: Let *I2* be an index such that *I2 > I*, and let
*F2* be the first extent that overlaps *I2*. Then, either
*F2* is in *S* or *F2* is greater than any extent in
*S*.

Proof: If *F2* does not include *I* then its start index is
greater than *I* and thus it is greater than any extent in
*S*, including *F*. Otherwise, *F2* includes *I*
and thus is in *S*, and thus *F2 >= F*.

Next: Extent Fragments, Previous: Zero-Length Extents, Up: Extents [Contents][Index]