.. Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. .. include:: ../../common.defs .. _developer-cache-tiered-storage: Tiered Storage ************** Tiered storage is an attempt to allow |TS| to take advantage of physical storage with different properties. This design concerns only mechanism. Policies to take advantage of these are outside of the scope of this document. Instead we will presume an *oracle* which implements this policy and describe the queries that must be answered by the oracle and the effects of the answers. Beyond avoiding question of tier policy, the design is also intended to be effectively identical to current operations for the case where there is only one tier. The most common case for tiers is an ordered list of tiers, where higher tiers are presumed faster but more expensive (or more limited in capacity). This is not required. It might be that different tiers are differentiated by other properties (such as expected persistence). The design here is intended to handle both cases. The design presumes that if a user has multiple tiers of storage and an ordering for those tiers, they will usually want content stored at one tier level to also be stored at every other lower level as well, so that it does not have to be copied if evicted from a higher tier. Configuration ============= Each :term:`storage unit` in :file:`storage.config` can be marked with a *quality* value which is 32 bit number. Storage units that are not marked are all assigned the same value which is guaranteed to be distinct from all explicit values. The quality value is arbitrary from the point of view of this design, serving as a tag rather than a numeric value. The user (via the oracle) can impose what ever additional meaning is useful on this value (rating, bit slicing, etc.). In such cases, all :term:`volumes ` should be explicitly assigned a value, as the default (unmarked) value is not guaranteed to have any relationship to explicit values. The unmarked value is intended to be useful in situations where the user has no interest in tiered storage and so wants to let |TS| automatically handle all volumes as a single tier. Operations ========== After a client request is received and processed, volume assignment is done. For each tier, the oracle would return one of four values along with a volume pointer: ``READ`` The tier appears to have the object and can serve it. ``WRITE`` The object is not in this tier and should be written to this tier if possible. ``RW`` Treat as ``READ`` if possible, but if the object turns out to not in the cache treat as ``WRITE``. ``NO_SALE`` Do not interact with this tier for this object. The :term:`volume ` returned for the tier must be a volume with the corresponding tier quality value. In effect, the current style of volume assignment is done for each tier, by assigning one volume out of all of the volumes of the same quality and returning one of ``RW`` or ``WRITE``, depending on whether the initial volume directory lookup succeeds. Note that as with current volume assignment, it is presumed this can be done from in memory structures (no disk I/O required). If the oracle returns ``READ`` or ``RW`` for more than one tier, it must also return an ordering for those tiers (it may return an ordering for all tiers, ones that are not readable will be ignored). For each tier, in that order, a read of cache storage is attempted for the object. A successful read locks that tier as the provider of cached content. If no tier has a successful read, or no tier is marked ``READ`` or ``RW`` then it is a cache miss. Any tier marked ``RW`` that fails the read test is demoted to ``WRITE``. If the object is cached, every tier that returns ``WRITE`` receives the object to store in the selected volume (this includes ``RW`` returns that are demoted to ``WRITE``). This is a cache to cache copy, not from the :term:`origin server`. In this case, tiers marked ``RW`` that are not tested for read will not receive any data and will not be further involved in the request processing. For a cache miss, all tiers marked ``WRITE`` will receive data from the origin server connection (if successful). This means, among other things, that if there is a tier with the object all other tiers that are written will get a local copy of the object, and the origin server will not be used. In terms of implementation, currently a cache write to a volume is done via the construction of an instance of :cpp:class:`CacheVC` which receives the object stream. For tiered storage, the same thing is done for each target volume. For cache volume overrides (via :file:`hosting.config`) this same process is used except with only the volumes stripes contained within the specified cache volume. Copying ======= It may be necessary to provide a mechanism to copy objects between tiers outside of a client originated transaction. In terms of implementation, this is straight forward using :cpp:class:`HttpTunnel` as if in a transaction, only using a :cpp:class:`CacheVC` instance for both the producer and consumer. The more difficult question is what event would trigger a possible copy. A signal could be provided whenever a volume directory entry is deleted, although it should be noted that the object in question may have already been evicted when this event happens. Additional Notes ================ As an example use, it would be possible to have only one cache volume that uses tiered storage for a particular set of domains using volume tagging. :file:`hosting.config` would be used to direct those domains to the selected cache volume. The oracle would check the URL in parallel and return ``NO_SALE`` for the tiers in the target cache volume for other domains. For the other tier (that of the unmarked storage units), the oracle would return ``RW`` for the tier in all cases as that tier would not be queried for the target domains.