METH v1.0a:current Methodology Note: ASIC Design Partitioning
NOTE-019401 1994:05:19 39K Methodology A:0761 R:0761
ASIC Design Partitioning
Karen Vahtra, Synopsys Inc.
Methodology Notes, January 1994, Volume 2, Number 2
Partitioning within an ASIC at the correct level of granularity can greatly
help the synthesis process. This document discusses partitioning from the
standpoint of the ASIC. Areas discussed include how many partitions or modules
should exist in an ASIC, where in the logic these boundaries should occur,
and what type of logic should be kept together in one module. This partitioning
is largely not a functional partitioning but one that aids the synthesis process.
This document will discuss a number of partitioning recommendations. Different
goals lie behind the recommendations, and these goals are discussed first,
followed by the detailed recommendations. The partitioning recommendations
are not rules. With a complete understanding of the goals and results of the
recommendations, designers can make their own decisions on when it is appropriate
or not appropriate to follow each individual recommendation. Examples demonstrate
the difference between effective partitioning and poor partitioning. At the
end of some of the partitioning recommendations there are guidelines describing
where the recommendation can be relaxed for particular design styles.
Partitioning is much easier to fix early in the design cycle than later.
Partitioning can be changed manually by altering the interfaces and code
within modules or by using the commands in Design Compiler(TM). The last
section discusses commands available in Design Compiler that can alter
partitioning.
Goals Behind the Partitioning Recommendations
Three main goals are behind the partitioning recommendations:
1. Produce the best synthesis results
2. Speed up the compile process
3. Simplify the constraint and script files for synthesis
The most significant goal behind the partitioning recommendations is to produce
better synthesis results. If logic is correctly arranged in modules and correctly
arranged in the hierarchy, good synthesis results are significantly easier
to obtain. A few partitioning recommendations address the goal of shortening
the compile step. A number of recommendations can really help in creating a
hierarchy that simplifies script automation. If strict partitioning recommendations
are followed, the designer only needs to create a few files for scripts and
constraints for the entire design.
Goal: Achieve the Best Synthesis Results
The strongest reason for altering partitioning is to get the best synthesis
results possible. Good partitioning can significantly help in meeting area
and performance constraints.
Five recommendations can help the designer to meet her or his constraints:
1. Keep related combinational logic together in the same module
2. Merge any sharable resources into the same module
3. Merge "user-defined resources" and the logic they drive into the same module
4. Separate out logic that has different design goals into separate blocks
5. Separate out logic that has different compile strategies into separate blocks
The following sections explain each of these five recommendations.
Recommendation: Keep Related Combinational Logic Together in the Same Module
The most important recommendation is to keep related combinational logic together
in the same module. Inter-module partitions can restrict logic optimization.
An example is shown in Figure 1.
The dashed box around the diagram indicates the module boundary. All of the
objects within the dashed line are within one module. The two symbols on the
right indicate two register banks. The three free-form clouds represent
combinational logic. Signals are represented as thin wires, and buses are
represented with thicker lines as shown in this diagram.
In this example, three combinational logic clouds are part of the critical
path or close to the critical path. Design Compiler has more flexibility in
optimizing a design if the related logic is in the same module and no artificial
boundaries exist. Hierarchical boundaries prevent any combining of related
logic. Without artificial boundaries as in Figure 1, Design Compiler can combine
related functions in the combinational clouds.
Example of Poor Partitioning
Figure 2 is an example of poor partitioning.
In Figure 2, the critical path is divided between two modules. This type of
partitioning leaves less flexibility for creating the best design. Design
Compiler cannot move logic across hierarchical boundaries during default
compile operations. In the last section of this methodology note, entitled
"Commands for Hierarchy Rearranging," a discussion covers how to remove these
restrictions using the ungroup command.
Also, the designer is forced to time budget the interconnect signals between
the modules very carefully. Time budgeting refers to allocating delays to paths
within a design among different partitions. To produce good results using the
design shown in Figure 2, the designer needs to time budget the interconnect
between these modules as shown in Figure 3.
In Figure 3, the designer must manually generate the time budget for a design
with a ten-nanosecond clock period. Six nanoseconds are allotted to the left
module, and the remaining four nanoseconds are allotted to the right module.
The four-nanosecond budget is indicated by the set_output_delay on the output
port of the left module, and the six-nanosecond budget is indicated by the
set_input_delay on the input port of the right module.
The designer also needs to predict the gates that both drive and load this
net. After estimating the type of driver in the left module, the designer needs
to specify this value with the set_drive command on the right module's input
port. After predicting the amount of load this signal needs to drive, the designer
specifies this value with the set_load on the output port of the left module.
Relaxation of the Recommendation
This recommendation is less important in two cases. The first case occurs for
datapath designs where detailed time budgets are available for the intermediate
signals. The example in Figure 4 is a case where detailed time budgets are
relatively simple to create.
In this example, three modules exist. The left module is a datapath module
with two levels of muxing logic, represented by the quadrangles. Since the
expected logic is very predictable, the time budgets for the intermediate signal
can be easily derived from the vendor's databook before synthesis. The time
through the mux logic is relatively predictable, as is the drive strength of
the mux in the last stage of logic. The only attribute that is less predictable
for time budgeting is the amount of load on the intermediate net.
A secondary reason this design does not need to conform to the partitioning
recommendation is that the logic in the combinational elements probably does
not need to be merged with the mux logic. If the designer knows a series of
muxes is the best implementation on the left module, Design Compiler will not
produce better results by merging the datapath muxes with the random control
logic it feeds.
If these three modules are compiled together with a hierarchical compile approach,
the designer may not wish to conform to this partitioning recommendation. The
various compile approaches are discussed in the Methodology Notes, Volume 1
Number 3 in the article "Compile Methodologies for Hierarchical Designs."
In Figure 5, all three modules are enclosed in their own hierarchy. If
hierarchical compile is used to compile all of these modules at once, the
partitioning recommendation can be relaxed. By compiling these modules using
hierarchical compile, intermediate time budgets are automatically calculated
by Design Compiler. The hierarchical compile approach still restricts logic
optimization across boundaries, and it is not appropriate if the left module
is random logic, as was the case in Figure 2.
Recommendation: Merge "Resources" into the Same Module
A resource is most commonly thought of as a synthetic operator that can be
directly inferred from an HDL, as shown in the following code fragment:
VHDL                          Verilog
if CTL = '1' then             if (CTL)
  Z <= A + B;                   Z <= A + B;
else                          else
  Z <= C + D;                   Z <= C + D;
end if;
Two adder resources are created in this example. One adder adds the signals
A and B together; the second adder adds the signals C and D together. Design
Compiler will choose whether to share the resources based upon the constraints.
If only an area constraint exists, Design Compiler will likely share the adders.
If performance is a consideration, the adders may or may not be merged.
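
The fragment above can be placed in a complete module as a minimal sketch. The module name share_add, the port list, and the 8-bit operand width are assumptions made for illustration; they do not come from the original note.

```verilog
// Hypothetical module: name, ports, and widths are illustrative assumptions.
module share_add (CTL, A, B, C, D, Z);
  input        CTL;
  input  [7:0] A, B, C, D;
  output [8:0] Z;          // one extra bit to hold the carry
  reg    [8:0] Z;

  // Both "+" operators appear in the same always block of the same
  // module, so the synthesis tool is free to weigh sharing one adder
  // against building two.
  always @ (CTL or A or B or C or D)
    if (CTL)
      Z = A + B;
    else
      Z = C + D;
endmodule
```

Because only one branch of the if statement is active at a time, the two additions are mutually exclusive, which is what makes them candidates for sharing.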
For Design Compiler to consider resource sharing, all relevant resources need
to be within one level of hierarchy. If the resources are not within one level
of hierarchy, Design Compiler cannot make tradeoffs to determine whether or
not the resources should be shared. Figure 6 shows a possible representation
of hierarchy that corresponds to this recommendation.
The two circles with the plus sign represent the adders. The quadrangle is
the mux that selects the correct sum depending upon the CTL signal.
Figure 6 is an example of good partitioning because the two adders are within
the same level of hierarchy. This partitioning allows Design Compiler full
flexibility when choosing to share or not share the adders.
Example of Poor Partitioning
Figure 7 is an example of poor partitioning. In this example, resources that
can be shared are separated by hierarchical boundaries.
In this example, four modules exist in the hierarchy. One module contains
only a subtractor, a second module contains only an adder, a third module contains
another adder, and the fourth module contains muxing logic and the register
bank. In this example, Design Compiler cannot combine the adders or the subtractor
because the resource-sharing algorithms do not work across hierarchical boundaries.
A better partitioning scheme would keep all of these elements in one level
of hierarchy to give Design Compiler full freedom. The adders could be merged
with the subtractor creating an adder/subtractor.
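
As a rough sketch of the better scheme, the two adders, the subtractor, the mux, and the register bank can all be described in one module. The module name, the SEL encoding, the operand names, and the widths below are assumptions for illustration only; Figure 7 does not specify them.

```verilog
// Hypothetical single-module version of the logic in Figure 7.
// Names, widths, and the SEL encoding are illustrative assumptions.
module arith_blk (CLK, SEL, A, B, C, D, Z);
  input        CLK;
  input  [1:0] SEL;
  input  [7:0] A, B, C, D;
  output [8:0] Z;
  reg    [8:0] Z;
  reg    [8:0] RESULT;

  // The adders and the subtractor are in one level of hierarchy,
  // so the tool may consider merging them (for example into an
  // adder/subtractor). The case statement stands in for the mux.
  always @ (SEL or A or B or C or D)
    case (SEL)
      2'b00:   RESULT = A + B;
      2'b01:   RESULT = C + D;
      default: RESULT = A - B;   // wraps modulo 2**9 if B > A
    endcase

  // Register bank on the output.
  always @ (posedge CLK)
    Z <= RESULT;
endmodule
```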
Also, time budgeting is required between the different modules. In this case,
the time budget is probably predictable given the datapath nature of the design.
Recommendation: Keep User-Defined Resources with the Logic They Drive
The term resources is typically thought of as resources that are inferred from
an HDL description. These resources are inferred directly from an HDL operator
such as a "+" or a ">" sign. Design Compiler uses the DesignWare(TM) mechanism
to infer these resources, perform resource sharing, and select the correct
architecture.
Resources can also be "user-defined resources." As with regular resources,
determining how many user-defined resources are needed in a design is an important
consideration. The DesignWare mechanism does not perform resource sharing on
user-defined resources. An example will help illustrate this concept.
If a design has an internal signal with a high fanout count, this critical
signal will probably be on the critical path. The logic that creates this critical
signal is the user-defined resource. An example of this type of situation could
be an error detector that is used extensively in a section of logic. The following
VHDL and Verilog code fragments are typical of such situations:
VHDL                          Verilog
process (INTERRUPT)           always @ (INTERRUPT)
  if .....                      if ...
    ERROR <= 1;                   ERROR <= 1;
  ....                          ...
process (DECODE, ERROR...     always @ (DECODE or ERROR ..
  case (DECODE)                 case (DECODE)
    when COND1 =>                 COND1 : if (~ERROR) ..
      if ERROR = '0' ...          COND3 : if (~ERROR) ...
    when COND3 =>                 COND4 : if (~ERROR) ...
      if ERROR = '0' then         COND5 : if (~ERROR) ...
    when COND4 =>                 ...
      if ERROR = '0' then
    when COND5 =>
      if ERROR = '0' then
  ....
process (MORE, ERROR...       always @ (MORE or ERROR ...
  if (ERROR ....                if (ERROR ..
This portion of a design has error detection circuitry, a decoder, and some
additional combinational logic. The first process checks the interrupt bus
and eventually determines whether or not an error occurred. This ERROR output
signal is a critical signal in the design because it fans out to a lot of places.
Within this code fragment, the error condition fans out to a decoder. Under
most but not all conditions of the case statement, the ERROR signal is checked
to be inactive. This ERROR signal is also used in the last process. Figure
8 is a representation of this type of design.
The user-defined resource is the left-most logic cloud. This user-defined resource
creates the ERROR signal that fans out to several combinational logic clouds.
This user-defined resource is important because it creates the critical ERROR
signal.
When user-defined resources are created, the user-defined resource and the
combinational logic it drives should be in the same module of hierarchy.
The question a designer needs to ask is "How many of my user-defined resources
do I need in the design?" Having the designer-defined resource in a single
module with the combinational logic it drives allows the designer to answer that
question by letting him or her experiment with the number of resources without
changing port boundaries.
If the loading on ERROR is very heavy, the best solution may require duplication
of the logic creating the ERROR signal. The designer may effectively create
two ERROR signals and two designer-defined resources as shown in Figure 9.
The designer-defined resources are duplicated by effectively creating two ERROR
signals within the HDL description as shown in the following code fragments:
VHDL                          Verilog
process (INTERRUPT)           always @ (INTERRUPT)
  if .....                      if ...
    ERROR1 <= 1;                  ERROR1 <= 1;
  ....                          ...
process (INTERRUPT)           always @ (INTERRUPT)
  if .....                      if ...
    ERROR2 <= 1;                  ERROR2 <= 1;
                                ...
process (DECODE, ERROR1..     always @ (DECODE or ERROR1 .
  case (DECODE)                 case (DECODE)
  ...                           ...
process (MORE, ERROR2..       always @ (MORE or ERROR2 ...
  if (ERROR2 ....               if (ERROR2 ..
The first process that created the ERROR signal now creates an ERROR1 signal;
a second process creates the same logic and produces an ERROR2 signal. The
ERROR1 signal is used for the decoder, and the ERROR2 signal is used in the
other process. The number of ERROR signals required is unknown and requires
designer experimentation. This experimentation with duplication is more easily
handled within one module because extra pins and ports do not need to be created.
A secondary reason for this partitioning recommendation is that detailed time
budgets are not required for the critical ERROR signal.
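
The duplication described above can be sketched as one self-contained module. Everything concrete here is an assumption made for illustration: the module name, the port widths, the parity function standing in for the error detector, and the decoder outputs. Whether the synthesis tool keeps both copies or merges them back together depends on the compile options, which the designer would need to verify.

```verilog
// Hypothetical sketch: the detector function, names, and widths
// are placeholders, not taken from the original design.
module err_dup (INTERRUPT, DECODE, MORE, OUT_A, OUT_B);
  input  [3:0] INTERRUPT;
  input  [1:0] DECODE;
  input        MORE;
  output       OUT_A, OUT_B;
  reg          ERROR1, ERROR2, OUT_A, OUT_B;

  // Two copies of the same designer-defined resource.
  always @ (INTERRUPT)
    ERROR1 = ^INTERRUPT;     // placeholder error detector, copy 1
  always @ (INTERRUPT)
    ERROR2 = ^INTERRUPT;     // placeholder error detector, copy 2

  // ERROR1 feeds only the decoder.
  always @ (DECODE or ERROR1)
    case (DECODE)
      2'b00:   OUT_A = ~ERROR1;
      2'b10:   OUT_A = ~ERROR1;
      default: OUT_A = 1'b0;
    endcase

  // ERROR2 feeds only the remaining logic.
  always @ (MORE or ERROR2)
    OUT_B = MORE & ERROR2;
endmodule
```

Adding a third copy or dropping back to one changes only internal signals; the port list of the module stays the same.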
Example of Poor Partitioning
Figure 10 is an example of poor partitioning. In this example, the
designer-defined resource is in a separate module.
In this example, four modules exist in the hierarchy: one module for each
of the combinational clouds and their corresponding registers, and one module
for the user-defined resource. With this type of partitioning, the designer
is forced to very carefully time budget the ERROR signal, and altering the
number of effective ERROR signals (or duplicating the designer-defined resource)
would actually change the port connections of the module boundaries.
Relaxation of the Recommendation
Flattening is the process of reducing the combinational logic into a two-level
representation. Most designs should not be flattened (e.g., datapath designs
or designs with good structure). Control logic often benefits from flattening.
If the module containing the user-defined resource and the logic it drives
should be flattened, the duplication would occur implicitly during the flattening
stage. In the example above, the critical signal was an error signal. Typically,
error-detection circuitry is largely made up of exclusive-OR trees that cannot
be flattened.
The designer can also relax this recommendation when the number of user-defined
resources is easily determined.
Recommendation: Separate Modules with Different Goals
To achieve the best synthesis results, design portions with different goals
should be isolated into their own level of hierarchy. The optimization algorithms
work with speed as the highest priority goal. To achieve the most area-efficient
design, the designer may wish to remove any speed constraints, apply a max_area
constraint, and perhaps turn on Boolean structuring. In order to apply this
compile strategy to a particular portion of a design, the designer needs to
isolate these non-speed critical sections of the design as shown in Figure
11.
In this figure, a section of the design off the critical path is separated
from the section on the critical path. To produce an area-efficient module
of the design on the left, the designer can remove any speed constraints, apply
only a max_area constraint, and turn on Boolean structuring for the left module.
Example of Poor Partitioning
Figure 12 is an example of poor partitioning. In this example, both the critical
path and sections of logic significantly off the critical path are merged into
one level of hierarchy.
Within Design Compiler, most constraints occur at the module level. The designer
cannot finely direct the optimization techniques at the individual-gate level.
In Figure 12, the designer may wish to try special optimization techniques
on the section of logic off the critical path in order to save area. Since
the logic is merged in with the critical path, the designer cannot try these
techniques.
Relaxation of the Recommendation
The designer may choose to ignore this recommendation for sections of the design
where the impact is insignificant. If only a small amount of logic can be optimized
for area only, the designer may choose to leave that logic in the same module
as the critical path and accept a slight increase in the design size.
Recommendation: Separate Modules with Different Strategies
The last recommendation for producing the best results involves separating
out different sections of logic that require different compile strategies.
This recommendation is very similar to the last recommendation. When a design
is pushing the limits of its technology, the designer may need to apply some
advanced compile strategies. Many of these compile strategies may only be applied
to an individual module.
Most constraints and attributes apply only to a module and not smaller
levels of granularity. Examples of these attributes include set_flatten
and set_structure with the Boolean optimization switch. The techniques and
explanations of these optimization algorithms are beyond the scope of this
article.
Figure 13 shows a picture of a correctly partitioned design.
One module contains some error-detection circuitry. Error-detection circuitry
is a highly structured design that usually contains large exclusive-OR trees.
Because error-detection circuitry is highly structured, the module to the left
cannot be flattened. The module on the right contains random logic, which should
be flattened. In this example, the design is correctly partitioned. The random
logic, which should be flattened, is separated from the error detection circuitry,
which cannot or should not be flattened.
Example of Poor Partitioning
Figure 14 shows an example of poor partitioning.
This design contains a section of random logic, which should be flattened.
The rest of the design contains an adder and some muxing logic, which should
not be flattened. Regardless of whether or not these logic elements are off
the critical path, the designer should separate these types of logic in order
to apply the correct sense of flattening to the individual pieces.
Relaxation of the Recommendation
The designer may choose to ignore this recommendation for sections of the design
where the impact is insignificant. If the design easily meets timing constraints
and plenty of room exists on the die, the designer may choose to add a few
extra gates to simplify partitioning.
Goal: Speed Up the Compile Process
A second goal of the partitioning recommendations is to reduce the time required
for the compile step. Three partitioning recommendations can help with reducing
compile time:
1. Eliminate glue logic
2. Maintain a reasonable gate size
3. Isolate point-to-point exceptions in the same module
These three recommendations will be discussed in detail in the next sections.
Recommendation: Eliminate Glue Logic
A design hierarchy should ideally only contain gates at the leaf levels of
the hierarchy tree. Figure 15 is a graphic example of this recommendation.
This hierarchical design contains four modules. No actual gates exist at the
hierarchical level. This recommendation eliminates the extra CPU time necessary
to compile small amounts of glue logic.
Another motivation behind removing glue logic is that script development is
significantly easier. If gates only exist at leaf-level cells, an automated
script mechanism only needs to compile and characterize leaf-level cells.
Example of Poor Partitioning
Figure 16 contains a poor partitioning implementation that contains glue logic
at the top level.
The design in Figure 16 contains three modules and a small AND gate. The AND
gate is the glue logic. To compile this AND gate will take a considerable
amount of time because the lower-level designs are part of the hierarchy
and need to be in memory. To reduce the run time, the AND gate should be
grouped in with the logic it drives or into its own level of hierarchy
with the group command using the logic option.
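
When the hierarchy is changed in the HDL source rather than with the group command, the result might look like the following sketch. The leaf module contents, names, and connections are placeholders invented for illustration; only the idea of moving the AND gate into the block it drives comes from the text above.

```verilog
// Placeholder leaf blocks; their contents are assumptions.
module blk_a (X, Y);
  input  X;
  output Y;
  assign Y = ~X;
endmodule

module blk_b (X, Y);
  input  X;
  output Y;
  assign Y = X;
endmodule

// The AND gate that used to sit at the top level now lives
// inside the block it drives.
module blk_c (P, Q, Z);
  input  P, Q;
  output Z;
  wire   GLUE;
  assign GLUE = P & Q;     // former top-level glue logic
  assign Z    = ~GLUE;     // placeholder for blk_c's own logic
endmodule

// The top level contains only instantiations and wires.
module top (I0, I1, Z);
  input  I0, I1;
  output Z;
  wire   A_OUT, B_OUT;
  blk_a U_A (.X(I0),    .Y(A_OUT));
  blk_b U_B (.X(I1),    .Y(B_OUT));
  blk_c U_C (.P(A_OUT), .Q(B_OUT), .Z(Z));
endmodule
```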
Relaxation of the Recommendation
Many designers incorporate glue logic in their hierarchy. Typically, in the
middle of a design project, removing existing glue logic may not be worth the
effort of rearranging the hierarchy. Also, if the design is compiled with a
hierarchical compile strategy, the glue logic is not much of an issue. If the
entire design in Figure 16 is compiled using hierarchical compile, the extra
AND gate between the partitions will not slow down the compile process. Also,
if the design hierarchy is removed before or during compile, the glue logic
is not an issue.
Recommendation: Gate Size
One of the most commonly asked questions is "How many gates should I have in
a level of hierarchy?" This question is difficult to answer. Gate size is
really only a secondary consideration compared to the rest of the partitioning
recommendations. If the designer follows the general partitioning recommendations,
most designs fall into a reasonable gate size. The general recommendation is
to maintain a gate size between 250 and 2,000 gates per module. Larger modules
may be acceptable for designs with sufficient CPU power and memory.
Figure 17 illustrates a design that is well partitioned. The design modules
are a reasonable size. If design modules are too small, the designer may be
restricting the optimization algorithms with artificial boundaries.
If design modules are too big, the compile run times can be prohibitive for
quick iterations. Also if ECOs are required, the changes will likely occur
only within one design hierarchy. After recompiling one module to accommodate
an ECO change, the internal names are usually changed. In order to minimize
the impact of name changes during an ECO process, smaller modules are recommended.
Example of Poor Partitioning
Figure 18 contains a poorly partitioned design. Ten gates is too small for a
module. Very likely, the block on the left can be merged with other logic.
Using a very small number of gates can severely limit optimization. The block
on the right is unnecessarily large. Design Compiler run times become prohibitive
at such a large gate size on most machines.
Relaxation of the Recommendation
Gate size is a secondary recommendation. If, for functional reasons or other
partitioning reasons, the design does not fall in the 250 to 2,000 gate range,
the designer should not be concerned. The primary goal of having a gate size
recommendation is to give the designer a ballpark idea of the size of modules
to design.
Recommendation: Point-to-Point Exceptions Within the Same Module
If point-to-point exceptions exist within a design hierarchy, the designer
should keep those exceptions within a module as shown in Figure 19.
In the design in Figure 19, a false path point-to-point exception exists from
the D_reg to the S_reg. The partitioning recommendation is that whenever a
point-to-point exception exists, the point-to-point exception needs to be
completely contained within a module of hierarchy. Point-to-point exceptions
occur when any command includes both the -to and the -from option. Another
example is the set_multicycle_path command. If a multicycle path occurs only
between particular sets of registers, these registers and the logic containing
these paths should all be within one level of hierarchy.
Point-to-point exceptions can slow down compile considerably. By containing
the point-to-point exception within one module, the compile run-time impact
is minimized. Another reason behind the partitioning recommendation is that
the characterize command only has limited support for point-to-point exceptions.
The details of these characterize limitations are beyond the scope of this
document and are described in the 3.1 Design Compiler Reference Manual.
Example of Poor Partitioning
Figure 20 contains the identical design as Figure 19 but with an additional
piece of hierarchy.
In Figure 20, the middle combinational logic cloud is in a separate level of
hierarchy called bottom. Because a point-to-point exception of a false path
crosses the module bottom, characterize cannot accurately represent all the
paths through pin B. If the false path is the longest path through B, the
characterization will incorrectly set all paths as false from input pin B.
Relaxation of the Recommendation
The recommendation can be relaxed if the design in Figure 20 is compiled using
a hierarchical compile strategy from the top level of the design. With hierarchical
compile, the recommendation is followed by wholly containing the point-to-point
exception in the hierarchy being compiled.
Goal: Simplify Scripts and Constraint Files
Following strict partitioning guidelines can greatly simplify the synthesis
process. This set of rules does not improve the quality of results but will
enormously simplify the set of scripts required for synthesis.
Four recommendations can help the designer simplify the synthesis process:
1. Register all outputs
2. Create core logic, pad ring, and test hierarchy
3. Separate negative-edge and positive-edge flip-flops
4. Isolate state machine logic
The following sections explain each of these four recommendations.
Recommendation: Register All Outputs
To simplify the constraint and scripting process, all outputs of a block should
be registered as shown in Figure 21.
This partitioning recommendation allows the designer to constrain each block
very easily. The drive strength on the inputs to an individual block is predictable,
and is equal to the drive strength of the average flip-flop. The input delays
from the previous block are very predictable, and are equal to the path through
the flip-flop. Because no combinational-only paths exist with all outputs registered,
the designer does not have to time budget her or his design or use the
set_output_delay command! Since one clock cycle occurs within each block, the
constraints are very simple, and are identical for each module.
Outside of the synthesis arena, this partitioning approach supports a coding
style that can speed up simulation. With all of the outputs registered, a block
can be described with only edge-triggered processes. The sensitivity list contains
only the clock and perhaps a reset pin. With a limited sensitivity list, this
approach speeds up simulation by having the process trigger only once per clock
cycle.
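
A minimal sketch of a block written in this style follows. The module name, the ports, and the AND function are assumptions chosen only to keep the example short.

```verilog
// Hypothetical block with every output registered.
// Names, widths, and the logic function are illustrative assumptions.
module reg_out_blk (CLK, RESET, A, B, Q);
  input        CLK, RESET;
  input  [7:0] A, B;
  output [7:0] Q;
  reg    [7:0] Q;

  // The only process is edge triggered; its sensitivity list holds
  // just the clock and an asynchronous reset. The combinational
  // logic (A & B) sits in front of the output register, so no
  // combinational-only path reaches the output port.
  always @ (posedge CLK or posedge RESET)
    if (RESET)
      Q <= 8'b0;
    else
      Q <= A & B;
endmodule
```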
Example of Poor Partitioning
Figure 22 contains an output signal, which is fed directly off of combinational
logic. It is an example of poor partitioning because the output signal Q is
not registered. This signal needs to be time budgeted in detail in order to
compile each leaf module. This time budgeting includes the amount of time used
in each module, the estimated drive strength, and the estimated load. Although
this style of partitioning is very common and often follows functional boundaries,
cross-module combinational paths require a lot more work when developing synthesis
scripts.
Relaxation of the Recommendation
In a few cases, the designer can relax this recommendation. First, if a detailed
specification exists down to the module level, including a thorough time budget,
this recommendation can be relaxed. Also, if the designer wishes to preserve
functional hierarchy, s/he can make the tradeoff between preserving functional
hierarchy and developing a more rigid time budget for those signals that cross
through block boundaries in combinational logic.
In addition, if a hierarchical compile technique is an acceptable alternative,
the designer can compile the two blocks like those in Figure 23 using hierarchical
compile.
Hierarchical compile automatically takes care of intermodule loads, drives,
and time budgets. If the hierarchical partition conforms to the registered
output rule, the partitioning rule does not need to be applied to lower levels
within that hierarchical partition. In the circuit shown in Figure 23, since
the two blocks are contained within a higher-level block, the higher-level
block conforms to the registered output rule. Therefore, if the designer chooses
to compile this section of logic using a hierarchical approach, the two sub-blocks
do not need to conform to the registered output rule.
Figure 24 is another case where the designer may wish to relax the registered
output recommendation. In Figure 24, the two blocks contain datapath elements.
The hierarchy is divided before the last mux. Since delays and time budgets
are much more predictable in datapath logic, the designer may wish to relax
the registered output requirement for datapath sections in the logic.
The third case where the recommendation can be relaxed occurs with latch-based
designs. Unless time borrowing is restricted, latched outputs do not fix timing
budgets.
Recommendation: Top-Level Partitioning - I/O Pad Isolation
Figure 25 illustrates the partitioning recommendation for the top of an ASIC:
The top level of a design should contain an I/O pad ring. Within the top level
of hierarchy should be a middle level of hierarchy with any JTAG modules, any
clock generation circuitry, and the core logic. The middle level of hierarchy
exists to allow the flexibility to instantiate any I/O pads. The clock generation
circuitry is isolated from the rest of the design because it typically must
be handcrafted and carefully simulated.
The pad ring is installed using the insert_pads command, and by specifying
chip-level attributes on the actual pins of the ASIC. Also, for bidirectional
and three-state pads, the designer needs to keep the three-state logic at the
top level of the chip so that insert_pads can find the correct I/O pad. The
designer can create the JTAG logic automatically with the insert_jtag command.
The core logic is isolated into a separate level of hierarchy.
This hierarchy arrangement is not a requirement but allows for an easier integration
and management between test logic, pads, and the functional core. The designer
will insert the scan chain into only the core logic. Also, most of the synthesis
process will occur completely within the core.
Example of Poor Partitioning
Figure 26 contains a design with functional logic and clock generation circuitry
combined.
Figure 26 contains a functional flip-flop, B, on the right. Flip-flop B has
a clock that is derived from the flip-flop A. Design Compiler does not allow
you to constrain clocks on flip-flop outputs until the design is first mapped
into gates. So until flip-flop A is compiled, flip-flop B cannot be constrained.
A better approach is to isolate clock generation circuitry from the functional
logic. The design is then effectively divided as shown in Figure 27.
In Figure 27, the clock generation circuitry is isolated from the functional
logic. The clock generated by flip-flop A is available as a port to the functional
logic. The result of this partitioning is that flip-flop B can be constrained
directly from the HDL source code.
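A minimal Verilog sketch of the division shown in Figure 27 follows. It
assumes, purely for illustration, that flip-flop A is a divide-by-two stage;
the source does not say what A does. All module and signal names are
hypothetical.

```verilog
// Clock generation module: contains flip-flop A only.
module clk_div (
    input  wire clk,
    input  wire rst_n,
    output reg  clk_div2      // generated clock, driven by flip-flop A
);
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            clk_div2 <= 1'b0;
        else
            clk_div2 <= ~clk_div2;   // assumed divide-by-two behavior
    end
endmodule

// Functional module: the generated clock arrives as an ordinary input port,
// so it can be constrained as a clock port before anything is mapped.
module func_logic (
    input  wire clk_div2,
    input  wire d,
    output reg  q
);
    always @(posedge clk_div2)
        q <= d;                      // flip-flop B
endmodule

// Parent level that wires the two modules together.
module clk_and_func (
    input  wire clk,
    input  wire rst_n,
    input  wire d,
    output wire q
);
    wire clk_div2;

    clk_div    u_clk_div (.clk(clk), .rst_n(rst_n), .clk_div2(clk_div2));
    func_logic u_func    (.clk_div2(clk_div2), .d(d), .q(q));
endmodule
```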
Recommendation: Separate Negative-Edge and Positive-Edge Flip-Flops
Test Compiler hooks up the scan chain for registers connected to the same clock
line in alphabetical order. If a module contains both positive-edge flip-flops
and negative-edge flip-flops, Test Compiler will likely mix up the positive-
and negative-edge flip-flops in the scan chain in a seemingly random manner.
In order to have better control of the scan chain order, the designer should
separate these flip-flops into different modules, as shown in Figure 28.
When this recommendation is followed, Test Compiler inserts the scan chains
predictably within each module, and the designer can control the scan-chain
ordering at the hierarchical levels.
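One way to express this separation in Verilog is sketched below. The module
names, bus width, and the way the two banks are connected are assumptions
made for the example, not taken from Figure 28.

```verilog
// All positive-edge flip-flops in one module.
module pos_regs (
    input  wire       clk,
    input  wire [3:0] d,
    output reg  [3:0] q
);
    always @(posedge clk)
        q <= d;
endmodule

// All negative-edge flip-flops in another module.
module neg_regs (
    input  wire       clk,
    input  wire [3:0] d,
    output reg  [3:0] q
);
    always @(negedge clk)
        q <= d;
endmodule

// Parent level: each register bank is a separate cell, so scan-chain order
// between the two banks can be decided at this level of hierarchy.
module edge_split (
    input  wire       clk,
    input  wire [3:0] d,
    output wire [3:0] q
);
    wire [3:0] stage;

    pos_regs u_pos (.clk(clk), .d(d),     .q(stage));
    neg_regs u_neg (.clk(clk), .d(stage), .q(q));
endmodule
```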
Relaxation of the Recommendation
Design Compiler customers who are not using Test Compiler do not need to conform
to this recommendation.
Recommendation: Isolate State Machines
If a design contains a state machine that might benefit from the state machine
compiler, the state machine should be isolated from the rest of the design
as shown in Figure 29.
The graphic on the left of Figure 29 represents a state machine. The reasoning
behind this recommendation is that the state machine extraction and
optimization process is simpler if the module contains only a state machine.
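As an illustration, a module that contains nothing but a state machine might
look like the following Verilog sketch. The states, ports, and encoding are
invented for the example; datapath logic that uses the busy output would
live in a separate module.

```verilog
// Hypothetical controller: state register, next-state logic, and output
// decode only. No counters, datapath, or unrelated random logic.
module ctrl_fsm (
    input  wire clk,
    input  wire rst_n,
    input  wire start,
    input  wire done,
    output reg  busy
);
    localparam [1:0] IDLE   = 2'b00,
                     RUN    = 2'b01,
                     FINISH = 2'b10;

    reg [1:0] state, next_state;

    // State register
    always @(posedge clk or negedge rst_n) begin
        if (!rst_n)
            state <= IDLE;
        else
            state <= next_state;
    end

    // Next-state and output logic
    always @(*) begin
        next_state = state;
        busy       = 1'b0;
        case (state)
            IDLE:    if (start) next_state = RUN;
            RUN:     begin
                         busy = 1'b1;
                         if (done) next_state = FINISH;
                     end
            FINISH:  next_state = IDLE;
            default: next_state = IDLE;
        endcase
    end
endmodule
```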
Relaxation of the Recommendation
If the state machine will not be optimized using the state machine compiler,
the state machine does not need to be in its own level of hierarchy.
Commands for Hierarchy Rearranging
The best approach for partitioning is to plan the partitioning of the design
before writing HDL code. If HDL code is already developed, Design Compiler
provides two commands, group and ungroup, that allow you to rearrange the
hierarchy. Also, as previously mentioned, the designer can execute a hierarchical
compile methodology in some cases in order to effectively remove the partitioning.
This hierarchical strategy works well for simplifying scripts but not for
improving optimization results. If artificial and incorrect barriers exist in
random logic paths, the best approach is probably to rearrange the hierarchy
either within the HDL code or by using the ungroup and group commands.
If the HDL code is already written, the designer may wish to experiment with
the group and ungroup commands before modifying partitions in the source code.
By using the group and ungroup commands, the designer can determine the impact
of altering partitions before committing to source code changes.
Ungroup
The ungroup command removes hierarchy. Figure 30 shows a design before and
after using the ungroup command.
On the left is a design top with three sub-designs. Each sub-design contains
only one combinational gate. With such small partitions, the optimization process
is severely restricted. The ungroup command can remove this hierarchy. The
ungroup command with the -all option removes all of the hierarchy at the current
design level. At the right, the design is ungrouped and all cells are at the
top level of the hierarchy.
The ungroup command can ungroup a list of cells in the current design. Also,
the designer can choose to ungroup not only all of the cells at the current
level of hierarchy but the entire design hierarchy by using the -flatten option.
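The effect of ungrouping can be pictured at the netlist level with the
Verilog sketch below. It is only an analogy for Figure 30: the gate types,
connectivity, and all names are assumptions, and the cell names a real
ungroup run produces may differ.

```verilog
// Before: three sub-designs, each holding a single combinational gate.
module sub_and (input wire a, input wire b, output wire y);
    assign y = a & b;
endmodule

module sub_or (input wire a, input wire b, output wire y);
    assign y = a | b;
endmodule

module sub_inv (input wire a, output wire y);
    assign y = ~a;
endmodule

module top_before (input wire a, input wire b, input wire c, output wire y);
    wire n1, n2;

    sub_and U1 (.a(a),  .b(b), .y(n1));
    sub_or  U2 (.a(n1), .b(c), .y(n2));
    sub_inv U3 (.a(n2), .y(y));
endmodule

// After: the same logic with the hierarchy removed. All three gates now sit
// at the top level, so they can be optimized together.
module top_after (input wire a, input wire b, input wire c, output wire y);
    wire n1, n2;

    and G1 (n1, a, b);
    or  G2 (n2, n1, c);
    not G3 (y, n2);
endmodule
```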
Group
The group command is the reverse of the ungroup command as it creates new levels
of hierarchy. Figure 31 illustrates the operation of the group command.
The group command allows you to create new levels of hierarchy from the objects
at the current level. At the left, the design top contains three cells: a mux
and two cells named U1 and U2, an inverter and a NAND gate. After the group
command is used to place U1 and U2 in a sub-design called new, there is a
level of hierarchy below top. Top now contains the mux and the sub-design new,
which contains the inverter and the NAND gate.
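The before and after views of Figure 31 can be approximated in Verilog as
shown below. The connectivity between the cells is assumed, since the figure
is not reproduced here, and the sub-design is named new_design rather than
new to avoid a clash with the SystemVerilog keyword.

```verilog
// Before grouping: the mux, the inverter U1, and the NAND gate U2 are all
// cells of top.
module top_before (
    input  wire a,
    input  wire b,
    input  wire c,
    input  wire sel,
    output wire y
);
    wire n1, n2;

    not  U1 (n1, a);
    nand U2 (n2, n1, b);
    assign y = sel ? c : n2;     // the mux (third cell)
endmodule

// After grouping: U1 and U2 live in a new sub-design.
module new_design (
    input  wire a,
    input  wire b,
    output wire n2
);
    wire n1;

    not  U1 (n1, a);
    nand U2 (n2, n1, b);
endmodule

// Top now contains only the mux and one instance of the new sub-design.
module top_after (
    input  wire a,
    input  wire b,
    input  wire c,
    input  wire sel,
    output wire y
);
    wire n2;

    new_design u_new (.a(a), .b(b), .n2(n2));
    assign y = sel ? c : n2;     // the mux
endmodule
```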
The group command can create a new level of hierarchy from a list of cells,
all cells except a list of cells, all combinational logic, logic read in as
a PLA, logic read in as a state machine, a block from an HDL, or a set of bused
gates read in from an HDL. Grouping HDL blocks is very powerful. This option
allows you to rearrange your HDL code immediately after reading in the source
code.
Below are examples of VHDL and Verilog source code:
VHDL                        Verilog
myprocess : process         always @ (posedge clk)
myblock : block             begin : myblock
Individual HDL blocks can be grouped with the -hdl_block option of the group
command. To group a Verilog always block, a VHDL process, or a VHDL
block, the designer refers to the block by name. In the examples above, the
name of the block is "myblock" for the VHDL block or Verilog always block,
and "myprocess" for the VHDL process.
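For context, a complete Verilog module with a named always block might look
like the sketch below. The module name, ports, and register behavior are
hypothetical; only the block label myblock comes from the example in the
text. The label is what would be passed to the -hdl_block option; the
remaining group arguments are not shown because the text does not give them.

```verilog
module hdl_block_example (
    input  wire       clk,
    input  wire [3:0] d,
    output reg  [3:0] q
);
    // The label after "begin" names the block so it can be grouped by name.
    always @(posedge clk)
    begin : myblock
        q <= d + 4'd1;
    end
endmodule
```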
Summary
This methodology note discussed various partitioning guidelines for synthesis.
These recommendations are not rules. Designers can make their own choices on
where they wish to partition for either increased quality of results, faster
run times, or simplified scripts. Instead of making drastic changes in the
HDL source, designers can experiment with various partitions using the group
and ungroup commands, and analyze the impact of possible partitioning changes.
Trademarks/Copyright ©1997 Synopsys, Inc. All Rights Reserved. Last Modified: Feb 12, 1997