This paper proposes a prototype emulation system based on Xilinx LCA for digital signal transport circuits. This system has a high speed interface, over 150Mbps, to emulate digital signal transport circuits at real-time operation speeds. This paper also introduces a design method that uses a multiple-FPGA system to emulate large scale digital circuits for high speed communication such as digital signal transport. Excellent emulator performance is achieved with the introduction of a circuit partitioning technique that utilizes the high-level partitioning described by the designer, and an automatic latch insertion technique. Experimental results show that the proposed method achieves the high-speed emulation of digital signal transport circuits at the rate of 156 Mbps.
FPGA-based ASIC development systems have become important
tools in contemporary ASIC design. Existing systems exhibit
low per-FPGA gate utilization (10 to 20 percent)
due to limited inter-chip communication. Attempts at overcoming
this limitation through the use of high dimensional
interconnection topologies have met with limited success.
This paper focuses on the prototype hardware and software
interfaces that have been developed for an FPGA-based
ASIC emulation system based on a new technique for overcoming
inter-chip communication limitations. This tech-
nique, referred to as virtual wires, intelligently multiplexes
each physical FPGA wire among a number of logical wires.
The Virtual Wires Emulation System exhibits high FPGA
gate utilization while achieving system speeds comparable
to existing logic emulators. A two-dimensional mesh interconnection
topology of FPGAs is used to eliminate the cost
of signal switching elements and to facilitate scalability.
A system capable of emulating 20,000 gates has been
constructed for under $3000. This system includes both
prototype emulation hardware and a Virtual Wires netlist
compiler. Currently, this system is being used as both a
simulation accelerator for the LSI Logic LSIM and Cadence
Verilog simulators and as an in-circuit emulator. Results
from mapping netlists, such as the 18K gate Sparcle microprocessor
[2], to this system for simulation acceleration and
in-circuit emulation indicate that virtual wires can substantially
increase FPGA utilization without adversely affecting
emulation speed.
Keywords: FPGA, logic emulation, prototyping, mesh
topology, static routing, virtual wires.
Field-Programmable Gate Arrays (FPGAs) are rapidly replacing low end Gate-Arrays because of their fast turn around time and low cost. However, system level Printed Circuit Board (PCB) concerns such as package partitioning and inter-package routing remain as difficult as ever. In this paper, we present a fast prototyping system that uses inexpensive high-performance Field Programmable Interconnect Chips (FPIC). The FPIC form a fixed crossbar that is 100% routable when used in cooperation with automatic package partitioning and crossbar routing algorithms. Regular PCB performance can be obtained in less time to market and less cost.
Multi-chip, board-level designs form a large portion of today's digital system designs. Unfortunately, traditional methods for debugging these designs, such as wire wrap, prototype fabrication and software simulation, are inadequate. Wire-wrap is complex and error-prone, prototype fabrication is time-consuming and it is difficult to isolate errors, and simulation is too slow fir full testing. Recent advances in FPGA-based systems offer hope for significant improvements in board-level prototyping. In this paper, we present Springbok, an integrated system for board-level prototyping that promises near-speed testing with the construction and modification simplicity of software simulation. It is comprised of an integrated software system for mapping designs to a flexible, extensible hardware implementation platform. This allows designs to take advantage of FPGA flexibility while using many of the complex chips that will implement the final design.
We propose two novel approaches to direct multiple-way circuit partitioning. Each approach includes a generic algorithm. In this paper, we discuss one of those algorithms in detail but we also give the basis of the other algorithm. The generic algorithm is used to generate three algorithms. The proposed algorithms outperform the other Kernighan-Lin based circuit partitioning algorithms on benchmark circuits. Since circuit partitioning plays an important role in the implementation of FPGA logic designs, the proposed algorithms are expected to provide great help in this process.
FPGA packages have maximum Size constraints much larger than the number of 10 pins. The resulting IO bottleneck during circuit partitioning means more required packages and more ordinary signal wires crossing. between the packages. This cuts more critical timing paths between packages and drastically decreases the circuit operational frequency. In this paper, a bottom-up circuit partitioning method with cone structures is presented. The solution can minimize the delay of critical paths that slow down the circuit without changing the number of partitioned packages.
This paper proposes an architecture that implements a hierarchical routing structure for FPGAs, called a hierarchical FPGA (HFPGA). The architectural parame- ters of this architecture are defined, and two switch matrix topologies for routing channels proposed. A set of new tools has been used to place and route several circuits on this architecture, with the goal of comparing the cost of HFPGAs to conventional symmetrical FPGAs. The results show that HFPGAs can implement circuits with fewer routing switches, and fewer switches in total, compared to symmetrical FGPAs, although they have the potential disadvantage that they may require more logic blocks due to coarser granularity.
This paper describes the development of a new fine grained sea-of-gates FPGA architecture. The architecture has been developed simultaneously with a proprietary set of autolayout tools and has been designed to be sympathetic towards these tools. New structures have been added that aid the autolayout process and the interaction of these structures with the software is explained. In addition the paper describes some of the problems traditionally associated with the fine grained approach and looks at how these have been addressed in this new architecture.
This paper describes a new Field-Programmable Gate Array (FPGA) architecture which reduces the density gap between FPGAs and Mask-Programmed Gate Arrays (MPGAs) for datapath oriented circuits. This is primarily achieved by operating on data as a number of four-bit slices. The interconnection network incorporates distinct sets of resources for routing control and data signals. These features reduce circuit area by sharing programming bits among four-bit slices, lowering the total number of storage cells required. Experimental results suggest that the implementation area of a design's datapath can be reduced by a factor of two through the use bit sharing. This paper discusses the requirements of logic blocks and routing structures that can be used to implement typical circuits containing a number of regularly structured datapaths of various sizes, as well as a small number of irregularities.
This paper proposes a method to reduce the size of 'n' input FPGA look up tables (LUTs) to less than 2n bits. The reduced LUT (RLUT) requires more inputs, which also increases the functionality of the LUT. The increase in the number of inputs is a trade-off against the smaller size of the RLUT. RLUTs for n = 4 should decrease total chip area while increasing logic capacity. The RLUT performance is equal to or better than standard LUTs.
This paper investigates very fine-grained architectures for a new family of FPGAs and compares different VLSI solutions. The new architectures take advantage of Binary Decision Trees (BDTs) for logic synthesis and Binary Decision Diagrams (BDDs) for logic optimization. Implementation can further be improved by correct ordering of the decision variables in the vertices (OBDDs). The two very fine-grained building cells, proposed herewith, can automatically implement the OBDD structure of any given function. They realize a direct MUX/DMUX implementation of OBDDs. As the topology of BDDs reflects a cellular organization, this technique allows the suppression of the technology mapping phase needed with conventional FPGAs. Two examples of the PREP benchmarks are summarized in this paper. Performance and cost of the new architectures are then discussed as well as the benefit brought by the MUX/DMUX solution from a power consumption point of view. Finally we present open avenues and conclusions.
In this paper we present an efficient algorithm for performance driven technology mapping for table-look-up based FPGA architectures using the general delay model. Since our algorithm does not impose any restrictions on the allowed values of edge delays, it can use delay information generated by a layout phase to intelligently guide the technology mapping in an iterative loop. The algorithm uses a set of node weights to measure the criticality of the nodes in a boolean network. A biased depth first search is then used to generate a topological ordering of the nodes and this ordering is used to guide the clustering step. An important feature of our algorithm is that it exploits reconvergent paths in the network automatically. The technology mapped circuits produced by our algorithm were compared with those produced by other algorithms. In terms of delay, our algorithm produces results with delays that are 30% less than those produced by flowmap on the average, showing that minimizing the number of levels in the technology mapped circuit does not ensure a low delay realization. In terms of the number of logic blocks, our results are comparable to those produced by flowmap.
Truly heterogenous FPGAs, those with two different kinds of logic block, don't exist in the commercial world in part because logic synthesis for them is difficult. The difficulty arises because FPGAs are pre-fabricated, and so the ratio of the number of each type of block is fixed, which requires a constraint on the mapping that is not present for either homogenous FPGAs or ASICs. This paper presents a general approach for technology mapping into heterogenous lookup-table based FPGAs. It is applied both at the boolean network node level and at a more global level across a collection of mapped sub-circuits. This latter portion of the algorithm is optimal, and is applicable to any heterogenous FPGA, not simply those based on lookup tables. In comparison to a reasonably obvious alternative approach (adjusting the output of an homogenous mapping algorithm) the new algorithm achieves, depending on the heterogenous architecture, an average improvement of up to 48% over a large set of benchmark circuits.
This paper addresses the problem of technology mapping to lookup-table based field-programmable gate arrays. It uses a different approach than others for the decomposition of the network into feasible nodes, the Davio expansion. To perform this it uses an efficient representation of the decomposed nodes, Functional Decision Diagrams (FDD). The influence of FDD optimisation methods to the mapped solution is shown. Certain methods for the covering phase into lookup tables were discussed, which are targeted to area, delay and mutability optimisation.
A graph-based technology mapper for optimizing area in lookup table-based FPGA. designs has been described. It is based on an efficient circuit partitioning algorithm which has been developed with the goal of minimizing the number of lookup tables required to implement a given circuit. The partitioned circuit is then merged using a heuristic algorithm for the matching problem. These algorithms and heuristics are described in detail. Some results obtained on the ISCAS85 benchmark circuits are included.
This paper presents an architecture independent method of technology mapping for table-lookup FPGAs. The internal architecture of logic blocks is a parameter for the mapping step. The first part of the paper describes a new mapping algorithm, which results were compared to existing mapping tools for the Xilinx 4000 series. In a second step the tool was used to examine different new architectures with respect to chip area and delay minimization. The mapping of those architectures showed an improved factor between gate equivalent and used chip area in comparison to commercially available architectures.
This paper reports on mapping methods for FPGA working directly on Binary Decision Diagram and will be illustrated both in a multiplexor and LUT based FPGAs respectively Actel and Altera Flex products. The Actel mapper is presently included in the Actel design kit.
This paper deals with the selection of state assignment options for finite state machines implemented in various targets. This choice depends on the intrinsic controller complexity, the partitioning strategy, the area/delay constraints and the target technologies. Industrial controllers as well as a sub-set of MCNC benchmarks have been synthesized and analyzed after place and route to define the selection strategy.
One of the main objectives in the process of mapping a digital circuit onto a LUT-based
FPGA structure is minimizing the total number of lookup-tables needed to implement
the circuit. This will increase the size of the circuit that can be implemented using the
available FPGA structure. In this paper we show that even restricted cases of the lookup-table
minimization for FPGA technology mapping are NP-complete (even when K is a
small constant), and that it can be solved optimally for all values of K on a tree input
in O(min{nlogn, nK}) time where n is the number of nodes in the network and K is
the input capacity of the LUTs. We study this because most practical digital circuits are
"close to tree" in topology. standpoint. Furthermore, our algorithm for the trees forms
the basis of our algorithm for general circuits. Based on our algorithm for trees we present
a polynomial time heuristic algorithm for general Boolean networks. Experimental results
confirm substantial decrease on the number of LUTs on a number of MCNC logic synthesis
benchmarks. We obtain 10% to 80% improvement on the number of LUTs compared to
the previous algorithms (even though we allow very limited operations, e.g., we do not
exploit Boolean properties of the network).
Key words: Field Programmable Gate Array, Technology Mapping, Lookup-Table Mini-
mization, NP-completeness, Directed Acyclic Graphs.
An area-efficient technique for implementation of a field-programmable analog array is proposed. The design consists of configurable analog blocks (CABs) that contain op-amps and/or passive elements. The connections between CABs are realized using four-transistor MOS transconductors. The conductance is controlled by varying the gate voltages as defined by either multi-valued memories or external/internal signals. A 1.2 µm CMOS IC with four CABs and 48 transconductors has been fabricated. Several analog circuits including: filter biquad, VCO, precision rectifier, quadrature oscillator, four-quadrant multiplier, square root circuit and sample-and-hold circuit have been successfully prototyped and tested on this IC.
In this paper we address the problem of the best architecture for high speed (high-frequency) field programmable analog array (FPAA). An architecture for current-mode FPAA is presented, which is suitable for a wide class of circuits, including continuous-time filters, differential and algebraic equation solvers, dynamic systems, and fuzzy and multi-valued logic circuits. Examples demonstrate an implementation of a ladder continuous-time filter, an analog rank filter, a circuit for tracking a product of two matrices, a linear equations solver, and a circuit for solving a linear programming problem by the method of steepest descent. The device achieves high performance through the reduced use of global signal interconnections, in favor of local ones. The FPAA is being implemented in a bipolar transistor array technology.
A new approach to redundancy for Field Programmable Gate Arrays (FPGAs) is proposed in this paper. The scheme is a novel reconfiguration network drawing on work in the design of wafer scale SIMD processors. The scheme calls for modifications to the wiring segments as well as the inclusion of a spare element at the end of each row. The proposed approach has removed the need for poly (anti) fuses by use of a dedicated on-chip reconfiguration processor. We show that using such a technique it should be possible to construct arrays up to 10 times larger than commercial devices with no detrimental effect on yield. The scheme has been designed to be applicable to any SRAM based FPGA and to keep full software compatibility with existing configuration tools.
In this paper, we present a new FPGA based on a multiplexer cell with selfdiagnosis and selfrepair capabilities. It is shown that the regular structure of FPGAs is particularly well suited for reconfigurable circuits. Error detection and reconfiguration mechanisms using spare cells are also presented.
There is currently great interest in' using fixed arrays of FPGAs for logic emulators, custom computing devices, and software accelerators. An important part of designing such a system is determining the proper routing topology to use to interconnect the FPGAs. This topology can have a great effect on the area and delay of the resulting system. Tree, Bipartite Graph, and Mesh interconnection schemes have all been proposed for use in FPGA-based systems. In this paper we examine Mesh interconnection schemes, and propose several constructs for more efficient topologies. These reduce inter-chip routing costs by more than 50% over the basic 4-way Mesh.
A new application of field programmable gate-ways is featured in the prototype of the University of Hawaii parallel computer (HPC). User programs are compiled and then executed partly or completely in one or more field programmable gate-arrays (FPGAs). Some compile-for-FPGA systems have yet to effectively implement full high-level language loop constructs. In this paper we show how the conditional logic used for generating jump addresses can be consolidated into one jump address calculation and computed in a FPGA. In addition, we show how this coprocessing can be easily added to reduce the number of memory transactions and cycles it takes to execute a program. An overview of the HPC shared memory architecture is presented. A shortest path programming example begins by describing the steps for automatically generating the FPGA source code and concludes with the results of a load-store analysis comparing execution with and without the FPGA.
Field-Programmable Gate Arrays (FPGAs) and Single-Instruction Multiple-Data (SIMD) processing arrays share many architectural features. In both architectures, an array of simple, fine-grained logic elements is employed to provide high-speed, customisable, bit-wise computation. In this paper, we present a unified computational array model which encompasses both FPGAs and SIMD arrays. Within this framework, we examine the differences and similarities between these array structures and touch upon techniques and lessons which can be transfered between the architectures. The unified model also exposes promising prospects for hybrid array architectures. We introduce the Dynamically Programmable Gate Array (DPGA) which combines the best features from FPGAs and SIMD arrays into a single array architecture.
We consider a switch module routing problem for FPGAs. This problem was introduced in [7] for evaluating switch module designs. Its solution is also useful for FPGA routabilty analysis. Only an approximate algorithm was previously known for this problem. In this paper, we present an optimal algorithm for the problem based on integer linear programming. Experimental results consistently show that our algorithm is very efficient in practical sized switch modules. We also identify interesting special cases of the problem which can be solved in polynomial time.
Unlike traditional ASIC technologies, the geometrical structures of clock trees in an FPGA are usually fixed and cannot be changed for different circuit designs. Moreover, the clock pins are connected to the clock trees via programmable switches. As a result, the load capacitances of a clock tree may be changed depending on the utilization and distribution of logic modules in an FPGA. To minimize clock skew, it is necessary to distribute the load capacitances, or equivalently the logic modules used by the circuit design, carefully according to the circuit design. In this paper we present an algorithm for selecting logic modules used for circuit placement such that the clock skew is minimized. The algorithm uses a dynamic programming technique and can be applied to the clock tree architectures used in commercial FPGAs. Furthermore, the algorithm can be extended to handle buffered clock trees and multi-phase clock trees. Experimental results show that the algorithm can reduce clock skews significantly as compared with the traditional placement algorithms which do not consider clock skew minimization.
In segmented channel routing of row-based FPGAs [6, 7], the routability and interconnection delays depend on the choice of the upper bounds on the number of programmable switches used in routing connections in the channel. In this paper, we present an algorithm for determining such upper bounds for all connections of a net simultaneously so that the predefined source-to-sink delay bound of the net is satisfied and the routability of the net is maximized. Distributed RC delay model is used to compute source-to-sink delay. The algorithm can also be applied to timing-driven routing of symmetrical-array based FPGAs and FPICs. Preliminary experimental results show that our algorithm can significantly improve routability and delay bound satisfiability as compared with the traditional uniform switch bound allocation approach [6, 7].
Logic implemented in an SRAM-based FPGA is reconfigurable; that is, changes can be made to the system's logic functions by reprogramming the FPGA(s) in the system. Examples are cited of systems that make use of this in-system reconfigurability. These applications can be divided into three main categories based on how the FPGA's reconfigurability is applied: systems with built-in diagnostics, adaptable system designs, and systems with multi-purpose hardware.
Off-the-shelf arithmetic chips provide high speed but offer only a limited choice of precisions. With Field Programmable Gate Arrays (FPGAs), good performance arithmetic hardware can be developed in precisions where custom chips are unavailable. We present a modular multiplier design for the Xilinx XC4010 FPGA that is easily tailored to any precision. Each module extends the precision in 4-bit increments, and modules are appended as needed. Our design avoids the common problem of large fanout delay in the critical path to provide a cycle time that is independent of precision. High precision multipliers span multiple chips without additional delays or additional logic block modules for interfacing.
We show how field-programable gate arrays can be used to efficiently implement neural nets. By implementing the training phase in software and the actual application in hardware, conflicting demands can be met: training benefits from a fast edit-debug cycle, and once the design has stabilized, a hardware implementation results in higher performance. While neural nets have been implemented in hardware in the past, larger digital nets have not been possible due to the real-estate requirements of single neurons. We present a bit-serial encoding scheme and computation model, which allows space-efficient computation of the sum of weighted inputs, thereby facilitating the implementation of complex neural networks.
The Reconfigurable Machine (RM) is a parallel architecture that is housed on the IBM/AT machine as a host computer. RM is built using of Xilinx's 4005 Field Programmable Gate Array (FPGA) and associated fast SRAMs. This paper describes the architecture and implementation of the Local Controller, which is also implemented using Xilinx's 4005 FPGA, specifically optimized to handle the communication between the RM and host computer. The Xilinx FPGA and it's associated CAD system XACT, together with the Mentor Graphics EDA system is used for implementing the control circuit easily and efficiently.
We consider the problem of circuit partitioning for multiple-chip implementation. One motivation for studying this problem is the current needs of good partitioning tools for implementing a circuit on multiple FPGA chips. We allow replication of logic gates as it would reduce circuit delay. Circuit partitioning with replication of logic gates is also called circuit clustering. In this paper, we present a circuit clustering algorithm that minimizes circuit delay subject to area and pin constraints on each chip, under the general delay model. Our algorithm is optimal under either the area constraint only or the pin constraint only. We tested our algorithm on a set of benchmark circuits and consistently obtained optimal or near-optimal results.
Min-cut replication is an optimization that has been recently identified as a method for improving partitions with application, for example, to logic emulation systems based on field-programmable gate arrays. By using unused gates on the FPGAs, min-cut replication can reduce the number of pins required, allowing relaxation of the partitioning pin constraint. The increased partitioning flexibility can result in partitions containing significantly fewer FPGAs. We describe Ti.PIa, a software tool incorporating rain-cut replication, and present the results of applying min-cut replication to designs partitioned by commercial software that itself introduces replication for timing correctness. The results indicate that min-cut replication can reduce the number of FPGAs required to implement a large design.
In this paper, we revisit the classical problem of functional decomposition [1, 2J that arises so often in logic synthesis. One basic problem that has remained largely unaddressed to the best of our knowledge is that of decomposing a function such that the resulting sub-functions are simple, i.e., have small number of cubes or literals. In this paper, we show how to solve this problem. We demonstrate that this problem is intimately related to the encoding problem, which is also of fundamental importance in sequential synthesis, especially state-machine synthesis. We formulate the decomposition problem using encoding. In general, an input-output encoding formulation has to be employed to solve the problem. However, we show that for programmable gate array architectures which use look-up tables, the input encoding formulation suffices, provided we use minimum-length codes. The last condition is actually not a constraint, since each extra code bit means that an extra table has to be used (and that could be expensive). The unused codes are used as don't cares for simplifying the sub-functions. We compare the original implementation of functional decomposition, which ignores the encoding problem, with the new version that uses encoding while doing decomposition. We obtain an average improvement of over 20% on a set of standard benchmarks for look-up table architectures.
This paper presents a vertex ordering heuristic which speeds up the search process in partitioning-based fitting algorithm. The fitting algorithm maps a sequential circuit onto a special application EPLD device, CY7C361 from Cypress Semiconductor. The same approach can also be used for other EPLD and FPGA architectures. The vertex ordering heuristic has decreased the search time from hours to seconds for more complex designs. The quality of results is not effected as the algorithm remains exact.
For any technology targets, it is very important to make a consistent use of library macro blocks. It is obvious that the FPGA vendors designed carefully the basic blocks and that for some targets placement and routing is pretty well handled or prepared for these blocks. Therefore, and especially for FPGA targets, the resynthesis of arithmetic operators would lead to poor results compared to designs where the library macro blocks are used. It seems very important to realize a good link from a VHDL description to fully use library macro block capabilities. In addition, most of the designers use a meet-in-the-middle design methodology. This means that they design some basic, dedicated, efficient basic blocks and they assemble or use them in a more complex design. Therefore it is very important to support a VHDL description using the functionalities of the predesigned basic blocks. The ASYL RTL VHDL synthesis system presented in this paper provides facilities for both library macro blocks handling and predesigned designees blocks.
Field-programmable gate arrays (FPGAs) are an inexpensive and flexible "low risk" design alternative to custom integrated circuits. While FPGA partitioning and technology mapping have been well-studied, FPGA routing has received much less attention. In this paper we propose a unified general framework for FPGA routing, which allows simultaneous optimization of multiple competing objectives under a smooth designer-controlled tradeoff. Our approach is based on a new and general multi-weighted graph formulation, enabling a good theoretical performance characterization, as well as an effective, practical implementation. Our router is architecture-independent, computationally efficient, and performs very well on industrial benchmarks. Finally, our multi-weighted graph formulation is quite general and is directly applicable to many other areas of CAD.
An FPGA layout algorithm is presented, which deals with placement and global routing simultaneously by fully exploiting its regular structure. It is based on a simple and fast top-down hierarchical bi-partitioning, with placement and global routes represented by positions of logic-blocks and pseudo-blocks, respectively. For each critical signal path, its upper bound in length is given as a constraint. During the bi-partitioning, the algorithm calculates estimated lengths of the paths and, based on them, partitions the set of blocks so as to satisfy the constraints. In this way, the signal path length is controlled during layout design. Experimental results for several benchmark circuits demonstrates its efficiency and effectiveness.
Two existing methods are used in computing interconnection delays for FPGAs. The first method, based on SPICE-like circuit simulation, provides very accurate results but requires intensive computations. The second method, based on (some variation of) a bounding box perimeter wire length estimate for the routing, is extremely fast. However, its inaccuracy is high (up to 50 % ) for FPGAs. Neither methods can be incorporated into optimization applications requiring reasonable accurate delays and a large number(100,000 or more) of delay estimates. In this paper, we propose a third method, applicable to FPGAs with segmented channels to quickly estimate net delays accurately (within 10% of SPICE). This method allows place and route (and other applications) to take advantage of the more accurate delays during iterative operations.
An accurate timing model plays a key role in performance driven synthesis, placement and routing besides timing verification. Timing modeling for antifuse based FPGAs is different from standard cells or gate arrays due to technology and architecture differences. In this paper, the methodology followed to develop a timing model for an enthuse based FPGA architecture is presented. The methodology includes extraction of Interconnect parasitics in the form of RC trees from the place and route results. A reduced order approximation obtained for the driving point admittance of RC trees Is coupled with SAS equations to compute load dependent delay and output slew. Rapid Interconnect Circuit Evaluator(RICE) is used for computing the wire delay. Experimental results of using the timing model to find correlation between delay and fanout is presented.
In this paper, we consider the problem of mapping a given global routes to detailed routes
(tracks) for two dimensional homogeneous FPGAs. We prove that for an arbitrary fixed
switch box topology of flexibility three, this problem remains NP-complete with or without
doglegs.
Index Term: FPGA Detailed Routing, NP-Complete, Switch Box Topology, multi-Dimensional
Interval Packing.