METH v1.0a:current Methodology Note: ASIC Design Partitioning

NOTE-019401 1994:05:19 39K Methodology A:0761 R:0761


ASIC Design Partitioning   

Karen Vahtra, Synopsys Inc.

Methodology Notes, January 1994, Volume 2, Number 2

Partitioning within an ASIC at the correct level of granularity can greatly 
help the synthesis process. This document discusses partitioning from the 
standpoint of the ASIC. Areas discussed include how many partitions or modules 
should exist in an ASIC, where in the logic these boundaries should occur,
and what type of logic should be kept together in one module. This partitioning 
is largely not a functional partitioning but one that aids the synthesis process.

This document will discuss a number of partitioning recommendations. Different 
goals lie behind the recommendations, and these goals are discussed first, 
followed by the detailed recommendations. The partitioning recommendations 
are not rules. With a complete understanding of the goals and results of the 
recommendations, designers can make their own decisions on when it is appropriate 
or not appropriate to follow each individual recommendation. Examples demonstrate 
the difference between effective partitioning and poor partitioning. At the 
end of some of the partitioning recommendations there are guidelines describing 
where the recommendation can be relaxed for particular design styles.

Partitioning is much easier to fix early in the design cycle than later. 
Partitioning can be changed manually by altering the interfaces and code 
within modules or by using the commands in Design Compiler(TM). The last 
section discusses commands available in Design Compiler that can alter 
partitioning.

Goals Behind the Partitioning Recommendations

Three main goals are behind the partitioning recommendations:

1. Produce the best synthesis results

2. Speed up the compile process

3. Simplify the constraint and script files for synthesis

The most significant goal behind the partitioning recommendations is to produce 
better synthesis results. If logic is correctly arranged in modules and correctly 
arranged in the hierarchy, good synthesis results are significantly easier 
to obtain. A few partitioning recommendations address the goal of shortening 
the compile step. A number of recommendations can really help in creating a 
hierarchy that simplifies script automation. If strict partitioning recommendations 
are followed, the designer only needs to create a few files for scripts and 
constraints for the entire design.

Goal: Achieve the Best Synthesis Results

The strongest reason for altering partitioning is to get the best synthesis 
results possible. Good partitioning can significantly help in meeting area 
and performance constraints.

Five recommendations can help the designer to meet her or his constraints:

1. Keep related combinational logic together in the same module

2. Merge any sharable resources into the same module

3. Merge "user-defined resources" and the logic they drive into the same module

4. Separate out logic that has different design goals into separate blocks

5. Separate out logic that has different compile strategies into separate blocks

The following sections explain each of these five recommendations.

Recommendation: Keep Related Combinational Logic Together in the Same Module

The most important recommendation is to keep related combinational logic together 
in the same module. Inter-module partitions can restrict logic optimization. 
An example is shown in Figure 1.

The dashed box around the diagram indicates the module boundary. All of the 
objects within the dashed line are within one module. The two symbols on the 
right indicate two register banks. The three free-form clouds represent 
combinational logic. Signals are represented as thin wires, and buses are 
represented with thicker lines as shown in this diagram.

In this example, three combinational logic clouds are part of the critical 
path or close to the critical path. Design Compiler has more flexibility in 
optimizing a design if the related logic is in the same module and no artificial 
boundaries exist. Hierarchical boundaries prevent any combining of related 
logic. Without artificial boundaries as in Figure 1, Design Compiler can combine 
related functions in the combinational clouds.

Example of Poor Partitioning

Figure 2 is an example of poor partitioning.

In Figure 2 the critical path is divided between two modules. This type of 
partitioning leaves less flexibility for creating the best design. Design 
Compiler cannot move logic across hierarchical boundaries during default 
compile operations. The last section of this methodology note, "Commands for 
Hierarchy Rearranging," discusses how to remove these restrictions using the 
ungroup command.

Also, the designer is forced to time budget the interconnect signals between 
the modules very carefully. Time budgeting refers to allocating delays to paths 
within a design among different partitions. To produce good results using the 
design shown in Figure 2, the designer needs to time budget the interconnect 
between these modules as shown in Figure 3.

In Figure 3, the designer needs to manually generate the time budget for a 
design with a ten-nanosecond clock period. Six nanoseconds of the period are 
used in the left module, and the remaining four nanoseconds are used in the 
right module. The four-nanosecond budget is indicated by the set_output_delay 
on the output port of the left module, and the six-nanosecond budget is 
indicated by the set_input_delay on the input port of the right module. 
The designer also needs to predict the gates that both drive and load this 
net. After estimating the type of driver in the left module, the designer needs 
to specify this value with the set_drive command on the right module's input 
port. After predicting the amount of load this signal needs to drive, the designer 
specifies this value with the set_load on the output port of the left module.

Relaxation of the Recommendation

This recommendation is less important in two cases. The first case occurs for 
datapath designs where detailed time budgets are available for the intermediate 
signals. The example in Figure 4 is a case where detailed time budgets are 
relatively simple to create.

In this example, three modules exist. The left module is a datapath module 
with two levels of muxing logic, represented by the quadrangles. Since the 
expected logic is very predictable, the time budgets for the intermediate signal 
can be easily derived from the vendor's databook before synthesis. The time 
through the mux logic is relatively predictable, as is the drive strength of 
the mux in the last stage of logic. The only attribute that is less predictable 
for time budgeting is the amount of load on the intermediate net.

A secondary reason this design does not need to conform to the partitioning 
recommendation is that the logic in the combinational elements probably does 
not need to be merged with the mux logic. If the designer knows a series of 
muxes is the best implementation on the left module, Design Compiler will not 
produce better results by merging the datapath muxes with the random control 
logic it feeds.

If these three modules are compiled together with a hierarchical compile approach, 
the designer may not wish to conform to this partitioning recommendation. The 
various compile approaches are discussed in the Methodology Notes, Volume 1 
Number 3 in the article "Compile Methodologies for Hierarchical Designs."

In Figure 5, all three modules are enclosed in their own hierarchy. If 
hierarchical compile is used to compile all of these modules at once, the 
partitioning recommendation can be relaxed. By compiling these modules using 
hierarchical compile, intermediate time budgets are automatically calculated 
by Design Compiler. The hierarchical compile approach still restricts logic 
optimization across boundaries, and it is not appropriate if the left module 
is random logic, as was the case in Figure 2.

Recommendation: Merge "Resources" into the Same Module

A resource is most commonly thought of as a synthetic operator that can be 
directly inferred from an HDL, as shown in the following code fragment:
 
        VHDL                            Verilog

        if CTL = '1' then               if (CTL)
                Z <= A + B;                     Z <= A + B;
        else                            else
                Z <= C + D;                     Z <= C + D;
        end if;

Two adder resources are created in this example. One adder adds the signals 
A and B together; the second adder adds the signals C and D together. Design 
Compiler will choose whether to share the resources based upon the constraints. 
If only an area constraint exists, Design Compiler will likely share the adders. 
If performance is a consideration, the adders may or may not be merged.

For Design Compiler to consider resource sharing, all relevant resources need 
to be within one level of hierarchy. If the resources are not within one level 
of hierarchy, Design Compiler cannot make tradeoffs to determine whether or 
not the resources should be shared. Figure 6 shows a possible representation 
of hierarchy that corresponds to this recommendation.

The two circles with the plus sign represent the adders. The quadrangle is 
the mux that selects the correct sum depending upon the CTL signal. 

Figure 6 is an example of good partitioning because the two adders are within 
the same level of hierarchy. This partitioning allows Design Compiler full 
flexibility when choosing to share or not share the adders.
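
As an illustration, the Verilog fragment above can be completed into a small 
module that keeps both adders in one level of hierarchy. This is a minimal 
sketch only: the module name shared_add, the port list, and the 8-bit operand 
width are assumptions that do not appear in the original fragment.

```verilog
// Hypothetical module: name, ports, and 8-bit width are assumptions.
module shared_add (
    input  wire       CTL,
    input  wire [7:0] A, B, C, D,
    output reg  [8:0] Z          // one extra bit keeps the carry
);
    // Both "+" operators sit in the same module, so a synthesis tool is
    // free to weigh one shared adder with input muxes against two adders.
    // Blocking assignments are used because the block is combinational.
    always @(CTL or A or B or C or D) begin
        if (CTL)
            Z = A + B;
        else
            Z = C + D;
    end
endmodule
```

If the two additions were instead placed in separate modules, each module 
would contain a single adder and the choice between sharing and not sharing 
would no longer be available within one level of hierarchy.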

Example of Poor Partitioning

Figure 7 is an example of poor partitioning. In this example, resources that 
can be shared are separated by hierarchical boundaries.

Four modules exist in this hierarchy. One module contains only a subtractor, 
a second module contains only an adder, a third module contains another 
adder, and the fourth module contains muxing logic and the register bank. 
Design Compiler cannot combine the adders or the subtractor because the 
resource-sharing algorithms do not work across hierarchical boundaries. 
A better partitioning scheme would keep all of these elements in one level 
of hierarchy to give Design Compiler full freedom. The adders could be merged 
with the subtractor, creating an adder/subtractor. 

Also, time budgeting is required between the different modules. In this case, 
the time budget is probably predictable given the datapath nature of the design.

Recommendation: Keep User-Defined Resources with the Logic They Drive

The term resources typically refers to resources that are inferred directly 
from an HDL operator such as a "+" or a ">" sign. Design Compiler uses the 
DesignWare(TM) mechanism to infer these resources, perform resource sharing, 
and select the correct architecture.

Resources can also be "user-defined resources." As with regular resources, 
determining how many user-defined resources are needed in a design is an important 
consideration. The DesignWare mechanism does not perform resource sharing on 
user-defined resources. An example will help illustrate this concept.

If a design has an internal signal with a high fanout count, this signal will 
probably be on the critical path. The logic that creates this critical signal 
is the user-defined resource. An example of this type of situation could be 
an error detector that is used extensively in a section of logic. The following 
VHDL and Verilog code fragments are typical of such situations:

        VHDL                            Verilog

process (INTERRUPT)                     always @ (INTERRUPT)
  if .....                                if ...
        ERROR <= 1;                             ERROR <= 1;
  ....                                    ...

process (DECODE, ERROR...               always @ (DECODE or ERROR ..
  case (DECODE)                           case (DECODE)
        when COND1 =>                           COND1 : if (~ERROR) ..
          if ERROR = '0' ...
        when COND3 =>                           COND3 : if (~ERROR) ...
          if ERROR = '0' then
        when COND4 =>                           COND4 : if (~ERROR) ...
          if ERROR = '0' then
        when COND5 =>                           COND5 : if (~ERROR) ...
          if ERROR = '0' then
        ....                                    ...

process (MORE, ERROR...                 always @ (MORE or ERROR ...
  if (ERROR ....                          if (ERROR ..


This portion of a design has error detection circuitry, a decoder, and some 
additional combinational logic. The first process checks the interrupt bus 
and eventually determines whether or not an error occurred. This ERROR output 
signal is a critical signal in the design because it fans out to a lot of places. 
Within this code fragment, the error condition fans out to a decoder. Under 
most but not all conditions of the case statement, the ERROR signal is checked 
to be inactive. This ERROR signal is also used in the last process. Figure 
8 is a representation of this type of design.

The user-defined resource is the left-most logic cloud. This user-defined resource 
creates the ERROR signal that fans out to several combinational logic clouds. 
This user-defined resource is important because it creates the critical ERROR 
signal.

When user-defined resources are created, the user-defined resource and the 
combinational logic it drives should be in the same module of hierarchy.

The question a designer needs to ask is "How many of my user-defined resources 
do I need in the design?"  Having the user-defined resource in a single 
module with the combinational logic it drives allows the designer to answer 
that question by experimenting with the number of resources without changing 
port boundaries.

If the loading on ERROR is very heavy, the best solution may require duplication 
of the logic creating the ERROR signal. The designer may effectively create 
two ERROR signals and two user-defined resources as shown in Figure 9.

The user-defined resources are duplicated by effectively creating two ERROR 
signals within the HDL description as shown in the following code fragments:

        VHDL                            Verilog

process (INTERRUPT)                     always @ (INTERRUPT)
  if .....                                if ...
        ERROR1 <= 1;                            ERROR1 <= 1;
  ....                                    ...

process (INTERRUPT)                     always @ (INTERRUPT)
  if .....                                if ...
        ERROR2 <= 1;                            ERROR2 <= 1;
  ...

process (DECODE, ERROR1..               always @ (DECODE or ERROR1 .
  case (DECODE)                           case (DECODE)
  ...                                     ...

process (MORE, ERROR2..                 always @ (MORE or ERROR2 ...
  if (ERROR2 ....                         if (ERROR2 ..


The first process that created the ERROR signal now creates an ERROR1 signal; 
a second process creates the same logic and produces an ERROR2 signal. The 
ERROR1 signal is used for the decoder, and the ERROR2 signal is used in the 
other process. The number of ERROR signals required is unknown and requires 
designer experimentation. This experimentation with duplication is more easily 
handled within one module because extra pins and ports do not need to be created.

A secondary reason for this partitioning recommendation is that detailed time 
budgets are not required for the critical ERROR signal.
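
The duplication described above can be sketched as one complete Verilog 
module. The sketch is illustrative only: the module name err_dup, the port 
names other than INTERRUPT, DECODE, and MORE, the bus widths, and the parity 
check standing in for the elided error condition are all assumptions.

```verilog
// Hypothetical sketch: widths, outputs, and the error condition are assumptions.
module err_dup (
    input  wire [3:0] INTERRUPT,
    input  wire [1:0] DECODE,
    input  wire       MORE,
    output reg        GRANT,
    output reg        FLAG
);
    reg ERROR1, ERROR2;

    // First copy of the user-defined resource: drives only the decoder.
    always @(INTERRUPT)
        ERROR1 = ^INTERRUPT;      // placeholder parity check

    // Second copy: identical logic, drives only the remaining logic.
    always @(INTERRUPT)
        ERROR2 = ^INTERRUPT;

    // Decoder checks ERROR1 under most, but not all, conditions.
    always @(DECODE or ERROR1)
        case (DECODE)
            2'b00:   GRANT = ~ERROR1;
            2'b01:   GRANT = ~ERROR1;
            default: GRANT = 1'b0;
        endcase

    // Additional logic uses the second copy.
    always @(MORE or ERROR2)
        FLAG = MORE & ERROR2;
endmodule
```

Going from one ERROR signal to two changes only internal signals here, so the 
module ports stay the same. Because the two copies are functionally identical, 
a synthesis tool may merge them again unless it is told to preserve them; how 
that is done is tool-specific.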

Example of Poor Partitioning

Figure 10 is an example of poor partitioning. In this example, the 
user-defined resource is in a separate module.

Four modules exist in this hierarchy: one module for each of the combinational 
clouds and their corresponding registers, and one module for the user-defined 
resource. With this type of partitioning, the designer is forced to very 
carefully time budget the ERROR signal, and altering the number of effective 
ERROR signals (or duplicating the user-defined resource) would actually change 
the port connections of the module boundaries.

Relaxation of the Recommendation

Flattening is the process of reducing the combinational logic into a two-level 
representation.  Most designs should not be flattened (e.g., datapath designs 
or designs with good structure). Control logic often benefits from flattening. 
If the module containing the user-defined resource and the logic it drives 
should be flattened, the duplication would occur implicitly during the flattening 
stage. In the example above, the critical signal was an error signal. Typically, 
error-detection circuitry is largely made up of exclusive-OR trees that cannot 
be flattened.

The designer can also relax this recommendation when the number of user-defined 
resources is easily determined.

Recommendation: Separate Modules with Different Goals

To achieve the best synthesis results, design portions with different goals 
should be isolated into their own level of hierarchy. The optimization algorithms 
work with speed as the highest priority goal. To achieve the most area-efficient 
design, the designer may wish to remove any speed constraints, apply a max_area 
constraint, and perhaps turn on Boolean structuring. In order to apply this 
compile strategy to a particular portion of a design, the designer needs to 
isolate these non-speed critical sections of the design as shown in Figure 
11.

In this figure, a section of the design off the critical path is separated 
from the section on the critical path. To produce an area-efficient version 
of the module on the left, the designer can remove any speed constraints, apply 
only a max_area constraint, and turn on Boolean structuring for the left module.

Example of Poor Partitioning

Figure 12 is an example of poor partitioning. In this example, both the critical 
path and sections of logic significantly off the critical path are merged into 
one level of hierarchy.

Within Design Compiler, most constraints occur at the module level. The designer 
cannot finely direct the optimization techniques at the individual-gate level. 
In Figure 12, the designer may wish to try special optimization techniques 
on the section of logic off the critical path in order to save area. Since 
the logic is merged in with the critical path, the designer cannot try these 
techniques.

Relaxation of the Recommendation

The designer may choose to ignore this recommendation for sections of the design 
where the impact is insignificant. If only a small amount of logic can be optimized 
for area only, the designer may choose to leave that logic in the same module 
as the critical path and accept a slight increase in the design size.

Recommendation: Separate Modules with Different Strategies

The last recommendation for producing the best results involves separating 
out different sections of logic that require different compile strategies. 
This recommendation is very similar to the last recommendation. When a design 
is pushing the limits of its technology, the designer may need to apply some 
advanced compile strategies. Many of these compile strategies may only be applied 
to an individual module.

Most constraints and attributes apply only to a module and not to smaller
levels of granularity. Examples of these attributes include set_flatten 
and set_structure with the Boolean optimization switch. The techniques and 
explanations of these optimization algorithms are beyond the scope of this 
article.

Figure 13 shows a picture of a correctly partitioned design.

One module contains some error-detection circuitry. Error-detection circuitry 
is a highly structured design that usually contains large exclusive-OR trees. 
Because error-detection circuitry is highly structured, the module to the left 
cannot be flattened. The module on the right contains random logic, which should 
be flattened. In this example, the design is correctly partitioned. The random 
logic, which should be flattened, is separated from the error detection circuitry, 
which cannot or should not be flattened. 

Example of Poor Partitioning

Figure 14 shows an example of poor partitioning.

This design contains a section of random logic, which should be flattened. 
The rest of the design contains an adder and some muxing logic, which should 
not be flattened. Regardless of whether or not these logic elements are off 
the critical path, the designer should separate these types of logic in order 
to apply the correct sense of flattening to the individual pieces.

Relaxation of the Recommendation

The designer may choose to ignore this recommendation for sections of the design 
where the impact is insignificant. If the design easily meets timing constraints 
and plenty of room exists on the die, the designer may choose to add a few 
extra gates to simplify partitioning.

Goal: Speed Up The Compile Process

A second goal of the partitioning recommendations is to reduce the time required 
for the compile step. Three partitioning recommendations can help with reducing 
compile time:

1. Eliminate glue logic

2. Maintain a reasonable gate size

3. Isolate point-to-point exceptions in the same module

These three recommendations will be discussed in detail in the next sections.

Recommendation: Eliminate Glue Logic

A design hierarchy should ideally only contain gates at the leaf levels of 
the hierarchy tree. Figure 15 is a graphic example of this recommendation.

This hierarchical design contains four modules. No actual gates exist at the 
upper levels of the hierarchy. This recommendation eliminates the extra CPU 
time necessary to compile small amounts of glue logic.

Another motivation behind removing glue logic is that script development is 
significantly easier. If gates only exist at leaf-level cells, an automated 
script mechanism only needs to compile and characterize leaf-level cells.

Example of Poor Partitioning

Figure 16 contains a poor partitioning implementation that contains glue logic 
at the top level.

The design in Figure 16 contains three modules and a small AND gate. The AND 
gate is the glue logic. Compiling this AND gate takes a considerable amount 
of time because the lower-level designs are part of the hierarchy and need 
to be in memory. To reduce the run time, the AND gate should be grouped in 
with the logic it drives or into its own level of hierarchy with the group 
command using the logic option.
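
At the HDL level, grouping the AND gate with the logic it drives might look 
like the following sketch. All module, instance, and port names are 
hypothetical, and the leaf modules are reduced to single flip-flops so that 
the example stays short.

```verilog
// Hypothetical leaf modules; names and functions are assumptions.
module blk_a (input wire clk, input wire d, output reg q);
    always @(posedge clk) q <= d;
endmodule

module blk_b (input wire clk, input wire d, output reg q);
    always @(posedge clk) q <= ~d;
endmodule

// Leaf module that absorbs the former top-level AND gate.
module blk_c (input wire clk, input wire a, input wire b, output reg q);
    wire both = a & b;     // glue logic now lives with the logic it drives
    always @(posedge clk) q <= both;
endmodule

// The top level holds only instances and wires, no gates.
module top (input wire clk, input wire x, input wire y, output wire z);
    wire qa, qb;
    blk_a u_a (.clk(clk), .d(x),  .q(qa));
    blk_b u_b (.clk(clk), .d(y),  .q(qb));
    blk_c u_c (.clk(clk), .a(qa), .b(qb), .q(z));
endmodule
```

With this arrangement, gates exist only in the leaf modules, so a script that 
compiles leaf-level cells covers all of the logic.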

Relaxation of the Recommendation

Many designers incorporate glue logic in their hierarchy. Typically, in the 
middle of a design project, removing existing glue logic may not be worth the 
effort of rearranging the hierarchy. Also, if the design is compiled with a 
hierarchical compile strategy, the glue logic is not much of an issue. If the 
entire design in Figure 16 is compiled using hierarchical compile, the extra 
AND gate between the partitions will not slow down the compile process. Also, 
if the design hierarchy is removed before or during compile, the glue logic 
is not an issue.

Recommendation: Gate Size

One of the most commonly asked questions is "How many gates should I have in 
a level of hierarchy?"  This question is difficult to answer. Gate size is 
really only a secondary consideration compared to the rest of the partitioning 
recommendations. If the designer follows the general partitioning recommendations, 
most designs fall into a reasonable gate size. The general recommendation is 
to maintain a gate size between 250 and 2,000 gates per module. Larger modules 
may be acceptable for designs with sufficient CPU power and memory.

Figure 17 illustrates a design that is well partitioned. The design modules 
are a reasonable size. If design modules are too small, the designer may be 
restricting the optimization algorithms with artificial boundaries. 

If design modules are too big, the compile run times can be prohibitive for 
quick iterations. Also if ECOs are required, the changes will likely occur 
only within one design hierarchy. After recompiling one module to accommodate 
an ECO change, the internal names are usually changed. In order to minimize 
the impact of name changes during an ECO process, smaller modules are recommended.

Example of Poor Partitioning

Figure 18 contains a poorly partitioned design. Ten gates is too few for a 
module. Very likely, the block on the left can be merged with other logic. 
Using a very small number of gates can severely limit optimization. The block 
on the right is unnecessarily large. Design Compiler run times become prohibitive 
at such a large gate size on most machines.

Relaxation of the Recommendation

Gate size is a secondary recommendation. If for functional reasons or other 
partitioning reasons, the design does not fall in the 250 - 2,000 gate range, 
the designer should not be concerned. The primary goal of having a gate size 
recommendation is to give the designer a ballpark idea of the size of modules 
to design.

Recommendation:  Point-to-Point Exceptions Within the Same Module

If point-to-point exceptions exist within a design hierarchy, the designer 
should keep those exceptions within a module as shown in Figure 19.

In the design in Figure 19, a false path point-to-point exception exists from 
the D_reg to the S_reg. The partitioning recommendation is that whenever a 
point-to-point exception exists, the point-to-point exception needs to be 
completely contained within a module of hierarchy. Point-to-point exceptions 
occur when any command includes both the -to and the -from option. Another 
example is the set_multicycle_path command. If a multicycle path occurs only 
between particular sets of registers, these registers and the logic containing 
these paths should all be within one level of hierarchy.

Point-to-point exceptions can slow down compile considerably. By containing 
the point-to-point exception within one module, the compile run-time impact 
is minimized. Another reason behind the partitioning recommendation is that 
the characterize command only has limited support for point-to-point exceptions. 
The details of these characterize limitations are beyond the scope of this 
document and are described in the 3.1 Design Compiler Reference Manual.

Example of Poor Partitioning

Figure 20 contains the same design as Figure 19 but with an additional 
level of hierarchy.

In Figure 20, the middle combinational logic cloud is in a separate level of 
hierarchy called bottom. Because a point-to-point exception of a false path 
crosses the module bottom, characterize cannot accurately represent all the 
paths through pin B. If the false path is the longest path through B, the 
characterization will incorrectly set all paths as false from input pin B.

Relaxation of the Recommendation

The recommendation can be relaxed if the design in Figure 20 is compiled using 
a hierarchical compile strategy from the top level of the design. With hierarchical 
compile, the recommendation is followed because the point-to-point exception 
is wholly contained in the hierarchy being compiled.

Goal: Simplify Scripts and Constraint Files

Following strict partitioning guidelines can greatly simplify the synthesis 
process. This set of rules does not improve the quality of results but will 
enormously simplify the set of scripts required for synthesis.

Four recommendations can help the designer simplify the synthesis process:

1. Register all outputs

2. Create core logic, pad ring, and test hierarchy

3. Separate negative-edge and positive-edge flip-flops

4. Isolate state machine logic

The following sections explain each of these four recommendations.

Recommendation: Register All Outputs

To simplify the constraint and scripting process, all outputs of a block should 
be registered as shown in Figure 21.

This partitioning recommendation allows the designer to constrain each block 
very easily. The drive strength on the inputs to an individual block is predictable, 
and is equal to the drive strength of the average flip-flop. The input delays 
from the previous block are very predictable, and are equal to the path through 
the flip-flop. Because no combinational-only paths exist with all outputs registered, 
the designer does not have to time budget her or his design or use the 
set_output_delay command!  Since one clock cycle occurs within each block, the 
constraints are very simple, and are identical for each module.

Outside of the synthesis arena, this partitioning approach supports a coding 
style that can speed up simulation. With all of the outputs registered, a block 
can be described with only edge-triggered processes. The sensitivity list contains 
only the clock and perhaps a reset pin. With a limited sensitivity list, this 
approach speeds up simulation by having the process trigger only once per clock 
cycle.
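
A block written in this style might look like the following Verilog sketch. 
The module name reg_out_blk, the port names, the 8-bit width, and the use of 
an asynchronous active-high reset are assumptions chosen for illustration.

```verilog
// Hypothetical block; names, width, and reset style are assumptions.
module reg_out_blk (
    input  wire       clk,
    input  wire       rst,
    input  wire [7:0] a, b,
    output reg  [7:0] sum_q
);
    // Only clk and rst appear in the sensitivity list, so the process is
    // evaluated on clock or reset edges, not on every change of a or b.
    always @(posedge clk or posedge rst) begin
        if (rst)
            sum_q <= 8'd0;
        else
            sum_q <= a + b;   // combinational logic feeds the output register
    end
endmodule
```

Every path that leaves this block starts at a flip-flop, so no 
combinational-only path crosses the module boundary.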

Example of Poor Partitioning

Figure 22 contains an output signal that is driven directly by combinational 
logic. It is an example of poor partitioning because the output signal Q is 
not registered. This signal needs to be time budgeted in detail in order to 
compile each leaf module. This time budgeting includes the amount of time used 
in each module, the estimated drive strength, and the estimated load. Although 
this style of partitioning is very common and often follows functional boundaries, 
cross-module combinational paths require a lot more work when developing synthesis 
scripts.

Relaxation of the Recommendation

In a few cases, the designer can relax this recommendation. First, if a detailed 
specification exists down to the module level, including a thorough time budget, 
this recommendation can be relaxed. Also, if the designer wishes to preserve 
functional hierarchy, s/he can make the tradeoff between preserving functional 
hierarchy and developing a more rigid time budget for those signals that cross 
through block boundaries in combinational logic. 

In addition, if a hierarchical compile technique is an acceptable alternative, 
the designer can compile the two blocks like those in Figure 23 using hierarchical 
compile.

Hierarchical compile automatically takes care of intermodule loads, drives, 
and time budgets. If the hierarchical partition conforms to the registered 
output rule, the partitioning rule does not need to be applied to lower levels 
within that hierarchical partition. In the circuit shown in Figure 23, since 
the two blocks are contained within a higher-level block, the higher-level 
block conforms to the registered output rule. Therefore, if the designer chooses 
to compile this section of logic using a hierarchical approach, the two sub-blocks 
do not need to conform to the registered output rule.

Figure 24 is another case where the designer may wish to relax the registered 
output recommendation. In Figure 24, the two blocks contain datapath elements. 
The hierarchy is divided before the last mux. Since delays and time budgets 
are much more predictable in datapath logic, the designer may wish to relax 
the registered output requirement for datapath sections in the logic.

A final case where the recommendation can be relaxed occurs with latch-based 
designs. Unless time borrowing is restricted, latched outputs do not fix timing 
budgets.

Recommendation: Top-Level Partitioning - I/O Pad Isolation

Figure 25 illustrates the partitioning recommendation for the top of an ASIC:

The top level of a design should contain an I/O pad ring. Below the top level 
should be a middle level of hierarchy containing any JTAG modules, any clock 
generation circuitry, and the core logic. The middle level of hierarchy exists 
to allow the flexibility to instantiate any I/O pads. The clock generation 
circuitry is isolated from the rest of the design because it typically must 
be handcrafted and carefully simulated.

The pad ring is installed by specifying chip-level attributes on the actual 
pins of the ASIC and then using the insert_pads command. For bidirectional 
and three-state pads, the designer also needs to keep the three-state logic at 
the top level of the chip so that insert_pads can find the correct I/O pad. The 
designer can create the JTAG logic automatically with the insert_jtag command. 
The core logic is isolated into a separate level of hierarchy.

This hierarchy arrangement is not a requirement, but it makes the test logic, 
pads, and functional core easier to integrate and manage. The designer inserts 
the scan chain into the core logic only, and most of the synthesis process 
takes place entirely within the core.

Example of Poor Partitioning

Figure 26 contains a design with functional logic and clock generation circuitry 
combined.

Figure 26 contains a functional flip-flop, B, on the right. Flip-flop B has 
a clock that is derived from flip-flop A. Design Compiler does not allow you 
to constrain clocks on flip-flop outputs until the design has been mapped into 
gates, so flip-flop B cannot be constrained until flip-flop A is compiled. 
A better approach is to isolate clock generation circuitry from the functional 
logic. The design is then effectively divided as shown in Figure 27.

In Figure 27, the clock generation circuitry is isolated from the functional 
logic. The clock generated by flip-flop A is available as a port to the functional 
logic. The result of this partitioning is that flip-flop B can be constrained 
directly from the HDL source code.

Recommendation: Separate Negative-Edge and Positive-Edge Flip-Flops

Test Compiler hooks up the scan chain for registers connected to the same clock 
line in alphabetical order. If a module contains both positive-edge flip-flops 
and negative-edge flip-flops, Test Compiler will likely mix up the positive- 
and negative-edge flip-flops in the scan chain in a seemingly random manner. 
In order to have better control of the scan chain order, the designer should 
separate these flip-flops into different modules, as shown in Figure 28.

When the designer follows this recommendation, Test Compiler inserts the scan 
chains accurately within each module, and the designer can control the 
scan-chain ordering at the hierarchical levels.
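
The effect of alphabetical ordering can be illustrated with a small Python 
sketch. This is a toy model only: the register names are hypothetical, and a 
plain alphabetical sort stands in for the ordering described above rather than 
reproducing Test Compiler's actual algorithm.

```python
# Toy model only: register names and the plain alphabetical sort are
# assumptions for illustration, not Test Compiler's actual algorithm.

def scan_order(registers):
    """Return register names in alphabetical order."""
    return sorted(registers)

def edge_switches(order, edge_of):
    """Count how often the clock edge changes between chain neighbours."""
    return sum(1 for a, b in zip(order, order[1:]) if edge_of[a] != edge_of[b])

# Hypothetical registers in one module: name -> active clock edge.
edge_of = {
    "addr_reg": "pos",
    "busy_reg": "neg",
    "ctrl_reg": "pos",
    "data_reg": "neg",
    "err_reg": "pos",
}

# One mixed module: the whole set is ordered alphabetically.
mixed_order = scan_order(edge_of)
mixed_switches = edge_switches(mixed_order, edge_of)

# Two modules split by clock edge: each module is ordered on its own,
# then the designer joins the two segments at the hierarchical level.
pos_module = [name for name, edge in edge_of.items() if edge == "pos"]
neg_module = [name for name, edge in edge_of.items() if edge == "neg"]
split_order = scan_order(pos_module) + scan_order(neg_module)
split_switches = edge_switches(split_order, edge_of)

print(mixed_order, mixed_switches)
print(split_order, split_switches)
```

In this sketch the mixed module alternates between positive- and negative-edge 
registers along the chain, while the split arrangement changes edge only once, 
at the point where the designer joins the two modules.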

Relaxation of the Recommendation

Design Compiler customers who are not using Test Compiler do not need to conform 
to this recommendation.

Recommendation: Isolate State Machines

If a design contains a state machine that might benefit from the state machine 
compiler, the state machine should be isolated from the rest of the design 
as shown in Figure 29.

The graphic on the left represents a state machine. The reasoning behind 
this recommendation is that the state machine extraction and optimization process 
is simpler if the module contains only a state machine. 

Relaxation of the Recommendation

If the state machine will not be optimized using the state machine compiler, 
the state machine does not need to be in its own level of hierarchy.

Commands for Hierarchy Rearranging

The best approach for partitioning is to plan the partitioning of the design 
before writing HDL code. If HDL code is already developed, Design Compiler 
provides two commands, group and ungroup, that allow you to rearrange the 
hierarchy. Also, as previously mentioned, the designer can use a hierarchical 
compile methodology in some cases to effectively remove the partitioning. 
This hierarchical strategy works well for simplifying scripts but not for 
improving optimization results. If artificial and incorrect barriers exist in 
random logic paths, the best approach is probably to rearrange the hierarchy 
either within the HDL code or by using the ungroup and group commands.

If the HDL code is already written, the designer may wish to experiment with 
the group and ungroup commands before modifying partitions in the source code. 
By using the group and ungroup commands, the designer can determine the impact 
of altering partitions before committing to source code changes.

Ungroup

The ungroup command removes hierarchy. Figure 30 shows a design before and 
after using the ungroup command.

On the left is a design top with three sub-designs. Each sub-design contains 
only one combinational gate. With such small partitions, the optimization process 
is severely restricted. The ungroup command can remove this hierarchy. The 
ungroup command with the -all option removes all of the hierarchy at the current 
design level. At the right, the design is ungrouped and all cells are at the 
top level of the hierarchy.

The ungroup command can ungroup a list of cells in a current design. Also, 
the designer can choose to ungroup not only all of the cells at the current 
level of hierarchy but the entire design hierarchy by using the flat option.
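
The behavior described above can be modeled with a short Python sketch. This 
is a toy model, not Design Compiler syntax: the dictionary representation, the 
"/" naming of promoted cells, and the cells and flatten parameters are all 
assumptions made for illustration. The nested sub-design U4 is added only to 
show the difference between ungrouping one level and ungrouping the entire 
hierarchy.

```python
# Toy model only: a design is a dict that maps each cell name to either a
# gate type (a leaf cell, stored as a string) or a nested dict (a sub-design).
# The "/" naming of promoted cells is an assumption for illustration.

def ungroup(design, cells=None, flatten=False):
    """Return a copy of `design` with the selected sub-designs dissolved.

    cells=None selects every sub-design at the current level.
    flatten=True also dissolves all hierarchy below each selected cell.
    """
    result = {}
    for name, contents in design.items():
        is_subdesign = isinstance(contents, dict)
        selected = cells is None or name in cells
        if is_subdesign and selected:
            inner = ungroup(contents, flatten=True) if flatten else contents
            for child, child_contents in inner.items():
                # Promote the child cell into the current level.
                result[f"{name}/{child}"] = child_contents
        else:
            result[name] = contents
    return result

# Hypothetical design: three sub-designs, one of which has its own sub-design.
top = {
    "U1": {"G1": "AND2"},
    "U2": {"G1": "OR2"},
    "U3": {"G1": "INV", "U4": {"G1": "NAND2"}},
}

one_level = ungroup(top)                 # all hierarchy at the current level
fully_flat = ungroup(top, flatten=True)  # the entire design hierarchy
only_u1 = ungroup(top, cells=["U1"])     # a list of cells

print(one_level)
print(fully_flat)
print(only_u1)
```

After one level of ungrouping, U4 survives as a sub-design one level higher. 
With flatten set, every gate ends up at the top level.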

Group

The group command is the reverse of the ungroup command as it creates new levels 
of hierarchy. Figure 31 illustrates the operation of the group command.

The group command allows you to create new levels of hierarchy from the objects 
at the current level. At the left, the design top contains three cells. Two 
of those cells are U1 and U2. After the designer uses the group command to 
place U1 and U2 in a sub-design called new, a level of hierarchy exists below 
top. Top now contains the mux and a sub-design called new, which contains the 
inverter and the NAND gate.
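
The grouping step in Figure 31 can be modeled with a short Python sketch. This 
is a toy model, not Design Compiler syntax: the dictionary representation, the 
cell name U3 for the mux, and the assignment of the inverter to U1 and the NAND 
gate to U2 are assumptions made for illustration.

```python
# Toy model only: a design is a dict that maps each cell name to either a
# gate type (a leaf cell, stored as a string) or a nested dict (a sub-design).

def group(design, cells, new_name):
    """Return a copy of `design` with `cells` moved into a new sub-design."""
    missing = [name for name in cells if name not in design]
    if missing:
        raise KeyError(f"not at the current level: {missing}")
    if new_name in design and new_name not in cells:
        raise ValueError(f"{new_name!r} already exists at this level")
    # Keep the cells that are not being grouped at the current level.
    result = {name: c for name, c in design.items() if name not in cells}
    # Place the grouped cells one level down, inside the new sub-design.
    result[new_name] = {name: design[name] for name in cells}
    return result

# Hypothetical contents of top; U3 and the gate types are assumed names.
top = {"U1": "INV", "U2": "NAND2", "U3": "MUX2"}

regrouped = group(top, ["U1", "U2"], "new")
print(regrouped)
```

The function returns a new dictionary and leaves the original unchanged, so 
several candidate partitions can be derived from the same starting design and 
compared.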

The group command can create a new level of hierarchy from a list of cells, 
all cells except a list of cells, all combinational logic, logic read in as 
a PLA, logic read in as a state machine, a block from an HDL, or a set of bused 
gates read in from an HDL. Grouping HDL blocks is very powerful. This option 
allows you to rearrange your HDL code immediately after reading in the source 
code.

Below are examples of VHDL and Verilog source code:

        VHDL                        Verilog

        myprocess : process         always @ (posedge clk)
                                    begin : myblock
        myblock : block


Individual HDL blocks can be grouped with the -hdl_block option of the group 
command. To group a Verilog always block, a VHDL process, or a VHDL block, the 
designer specifies the name of the block. In the examples above, the name of 
the block is "myblock" for the VHDL block or Verilog always block, and 
"myprocess" for the VHDL process.

Summary

This methodology note discussed various partitioning guidelines for synthesis. 
These recommendations are not rules. Designers can make their own choices on 
where they wish to partition for either increased quality of results, faster 
run times, or simplified scripts. Instead of making drastic changes in the 
HDL source, designers can experiment with various partitions using the group 
and ungroup commands, and analyze the impact of possible partitioning changes.



Trademarks/Copyright ©1997 Synopsys, Inc. All Rights Reserved. Last Modified: Feb 12, 1997