WO1988005190A1 - Microprogrammable language emulation system

Publication number
WO1988005190A1
WO1988005190A1 PCT/US1987/003444 US8703444W WO8805190A1 WO 1988005190 A1 WO1988005190 A1 WO 1988005190A1 US 8703444 W US8703444 W US 8703444W WO 8805190 A1 WO8805190 A1 WO 8805190A1
Authority
WO
WIPO (PCT)
Prior art keywords
instruction
token
processor
tokens
program
Prior art date
Application number
PCT/US1987/003444
Other languages
English (en)
Inventor
Arthur E. Speckhard
Joseph M. Thames
Original Assignee
International Meta Systems, Inc.
Application filed by International Meta Systems, Inc.
Priority to PCT/US1987/003444 (WO1988005190A1)
Priority to EP19880900933 (EP0343171A4)
Publication of WO1988005190A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32 Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/322 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address
    • G06F9/328 Address formation of the next instruction, e.g. by incrementing the instruction counter for non-sequential address for runtime instruction patching
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G06F8/31 Programming languages or programming paradigms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/22 Microcontrol or microprogram arrangements
    • G06F9/226 Microinstruction function, e.g. input/output microinstruction; diagnostic microinstruction; microinstruction format
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/22 Microcontrol or microprogram arrangements
    • G06F9/24 Loading of the microprogram
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines

Definitions

  • The present invention generally relates to a system for execution of high level computer language programs. More particularly, it relates to a system for emulating high level computer languages and executing programs written in such languages.
  • Computer languages have traditionally been divided into two classes: programming languages and machine languages. As the name implies, machine languages have been used to refer to machine elements of computers and have been directed to each action to be performed by the computers for which they are written.
  • Programming languages are generally considered to be "high level" languages because their operators and constructs correspond more to properties of applications or problems to be solved than to actual machine functions or physical elements. Because high level languages typically do not make reference to actual machine hardware, high level languages have been translated into machine languages in order for computers to execute the statements of high level languages.
  • The traditional method for translating high level languages has been to compile several machine language instructions for each high level language statement. Multiple machine language instructions have been used because the logic of high level languages typically cannot be expressed in one-to-one correspondence with machine language instructions.
  • A disadvantage of compiling several machine language instructions for each high level language statement is that most computers execute only one instruction at a time. Moreover, each instruction must be fetched from a memory location one at a time, presenting a "bottleneck" between memory access time and machine processing speed.
  • Microfiche appendices which constitute a part of this specification, are as follows: MICROFICHE APPENDIX A is a computer program listing of the encoder program of the preferred embodiment, contained on three microfiche having 201 frames; and
  • MICROFICHE APPENDIX B is a computer program listing of the emulator program of the preferred embodiment, contained on 2 microfiche having 162 frames.
  • the present invention solves the problems associated with the compiling of high level languages by encoding such languages in variable length tokens representative of characteristics intrinsic to such languages.
  • the tokens are then executed by a processor which is microprogrammed to emulate a high level language.
  • the system of the present invention may be used independently or in conjunction with a host computer system, such as an IBM PC AT or equivalent system, to provide high speed processing of application programs written in a multiplicity of high level languages.
  • Because the processor of the present invention is microprogrammable, encoder and emulator programs written for specific languages may be used as microcode for the processor.
  • FIG. 1 is a block diagram of apparatus of the preferred embodiment
  • FIG. 2 is a block diagram of a program procedure
  • FIG. 3 is a logic flow-chart showing the word padding function of the preferred embodiment
  • FIG. 4 is a sample listing of tokens of the preferred embodiment, which are used to represent a high level language statement
  • FIG. 5 is a conceptualized depiction of the branching operation of the preferred embodiment
  • FIGs. 6a and 6b are graphic representations of alternative instruction formats of the processor of the preferred embodiment.
  • FIG. 7 is a block diagram of the processor of the preferred embodiment.
  • An expansion board 10 is added to a host computer 20, such as an IBM PC AT or equivalent system.
  • the expansion board 10 includes a main memory 30, a processor 40, a cache memory 50, an instruction memory 60, an interface 70 with the host computer 20 and bus and control lines 80.
  • the main memory 30 is a dynamic random access memory unit having a storage capacity of approximately 1 megabyte and information is stored in the general purpose memory 30 in 32-bit words.
  • An encoder program is loaded into the instruction memory 60 by the host computer 20 through the interface 70 and bus lines 80.
  • The instruction memory is a loadable read only memory (ROM) unit having a storage capacity of 64K 32-bit words.
  • a listing of a representative encoder program is attached to this disclosure as Microfiche Appendix A and incorporated by reference.
  • The encoder program causes the processor 40 to fetch statements of the high level language program from the main memory 30, encode each statement in a representative stream of variable length bit fields ("tokens") without regard to word boundaries, and store the encoded statements in the main memory 30.
  • an emulator program is loaded into the instruction memory 60 by the host computer 20 through the interface 70 and bus lines 80.
  • a listing of a representative emulator program is attached to this disclosure as Microfiche Appendix B and incorporated by reference.
  • The emulator program causes the microprocessor to fetch the encoded statements from the main memory 30 and interpret the tokens of the encoded statements into microcode instructions, resulting in execution of the high level language. That is, the instruction tokens executed by the emulator program have a syntax and data structure resembling the syntax and data structure of the high level language rather than the physical elements of the processor. Instruction tokens are packed into the main memory 30 and subsequently fetched and executed by the processor 40.
  • Each procedure 100 has a specific function to perform and typically comprises three major parts: a header 110; a code body 120; and a data contour 130.
  • The header 110 contains a table which describes the bit size and storage locations of the code body 120 and the data contour 130.
  • The code body 120 contains the logic statements of the procedure, and the data contour 130 contains data or storage locations of data to be used in the procedure.
  • Absolute addresses are not used to provide location references to elements of the code body 120 or data contour 130. Rather, all location references are relative to fixed positions in the data contour 130 or the header 110. Thus, machine references are not directly made. An exception exists, however, for global references which are used to refer to other procedures and to common or global variables.
  • Each statement of the high level language is represented in a stream of instruction tokens generally without regard to word size, again allowing for independence from machine constraints.
  • Each token is a variable length bit field representable by an integer pair having the following general form.
  • the first integer of the pair is a constant of the emulator program, fixed by the order of tokens in the language representation, which indicates the length of the token in bits.
  • the second integer is the value of the token, representative of an operator or operand corresponding to characteristics that are intrinsic to the execution of the high level language.
  • The integer pair provides a switching context within the emulator program potentially having a branch for each integer in the set of all integers representable by the token of the specified length. That is, during execution, the emulator program interprets each token to cause the processor 40 to branch to locations in analogous fashion to a FORTRAN computed GO TO statement or a PASCAL CASE statement.
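The switching context described above can be sketched as a dispatch table. This is a minimal Python illustration, not the emulator's microcode: the helper names `make_context` and `dispatch` and the handler body are hypothetical, and only the six-bit Assign branch (value one) and the treatment of zero as padding are taken from the text.

```python
# Hypothetical sketch: each context has a fixed token length (a constant of the
# emulator, not stored in the stream) and one branch per nonzero token value.

def make_context(length_bits, branches):
    """Build a switching context: a token length plus a value -> handler map."""
    limit = 1 << length_bits
    for value in branches:
        if not 0 < value < limit:
            raise ValueError(f"value {value} does not fit in {length_bits} bits")
    return {"length": length_bits, "branches": dict(branches)}

def dispatch(context, token_value):
    """Branch on the token value, like a computed GO TO or a CASE statement."""
    if token_value == 0:
        return "pad"  # zeros are padding, so the emulator moves to the next word
    handler = context["branches"].get(token_value)
    if handler is None:
        raise KeyError(f"no branch for value {token_value}")
    return handler()

# Statement context: six-bit primary operators. Only Assign (value 1) comes
# from the text; the handler body is a placeholder.
STATEMENT = make_context(6, {1: lambda: "assign"})
```

Calling `dispatch(STATEMENT, 1)` selects the Assign branch, while a zero value is treated as padding rather than as a branch.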
  • The token stream format is presented herein in a left-to-right ordering, but in actual implementation the ordering is right-to-left. That is, token streams fill memory words of the main memory 30 beginning at the least significant bit (bit position 0) of a memory word and ending at the most significant bit (bit position 31).
  • Although the tokens are generated without regard to word size, they must be packed in memory according to the word boundaries of the main memory 30, that is, 32 bits. Accordingly, as shown in FIG. 3, the number of bits remaining in a memory word is calculated (200) for each token stored.
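The packing step can be illustrated with a short Python sketch. The remaining steps of FIG. 3 are not reproduced in this text, so the sketch assumes that a token which does not fit in the bits remaining in the current word is moved to the next word and that the unused high-order bits are left as a zero pad. The function name `pack_tokens` is hypothetical.

```python
WORD_BITS = 32  # main memory word size given in the text

def pack_tokens(tokens):
    """Pack (length, value) tokens into 32-bit words, least significant bit first.

    Assumption: a token that does not fit in the bits remaining in the current
    word is placed in the next word, and the unused high-order bits stay zero.
    """
    words = []
    current, used = 0, 0
    for length, value in tokens:
        if not 0 < length <= WORD_BITS:
            raise ValueError("token length must be 1..32 bits")
        if value < 0 or value >> length:
            raise ValueError("value does not fit in the token length")
        remaining = WORD_BITS - used      # bits left in the word (step 200)
        if length > remaining:            # zero-pad and start a new word
            words.append(current)
            current, used = 0, 0
        current |= value << used          # bit position 0 is filled first
        used += length
    if used:
        words.append(current)
    return words
```

For instance, a 6-bit token followed by a 30-bit token occupies two words under this assumption, because only 26 bits remain after the first token.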
  • the first integer in each primary operator token representation indicates that the instruction token is six bits in length. This integer is not part of the instruction token stream, but is part of the emulator program which executes the token stream. Its value is predictable according to the ordering of tokens in the language representation.
  • the second integer indicates which statement of the high level language is being encoded. For each of these operators there are suboperations corresponding to the suboperations of the high level language, thus providing isomorphic representation of the high level language. A common suboperation is the
  • an operand which contains a blend of operators and operands.
  • An operand may in turn correspond to syntax names or literals as in the high level language.
  • Operands are token structures corresponding to syntax names that are used to reference the content of data structures. However, operand references are not simply memory addresses. Rather, the mapping of references to data content is a dynamic process. Two kinds of operand references are employed in code body 120 expressions: variable references and name references. Variable references are references to variable data structures, while name references correspond to names of variable data structures. In general, variable references are used to retrieve data, whereas name references are used to store data. A name is considered a literal.
  • The second field of the name reference is actually a type code, the integer pair 4:3 (length four, value three), referring to type "name," since a name is a literal data structure.
  • The reference structure has a four-bit class field and a variable format primary reference, and may have a secondary reference relative to the primary one (common variables only).
  • the primary reference is an index to one of the tables in the contour 130 portion of the object program, as designated by the class field.
  • the four-bit class field and associated class designations are listed in TABLE IV below:
  • the primary reference format has a two bit length code and a value subfield.
  • the length of the value subfield is encoded in the length specification shown in TABLE V below:
  • the preceding class code and the length code are contiguous in the same word so that a zero length code may be distinguished from a zero pad, mentioned above.
  • the value represents a relative pointer into one of the tables in the contour 130 as specified by the class field.
  • the length code applies only to the block-value.
  • the offset value is a "word encoded" literal, meaning that its size is determined by the number of remaining bits in the word containing the preceding fields. If enough bits remain to contain the value (which cannot be zero), these bits are used as the offset-value field. Otherwise, the remaining bits are zero, and the next full word is used as the offset field.
  • Literal operands and operand data structures have similar formats. Literal operands exist in the code body 120. Operand data structures, if they exist prior to execution, are stored in the data contour 130. Otherwise, they are created and stored dynamically and are referenced indirectly through the data contour 130 tables. Data structures are prefaced by a four-bit type code, as shown in TABLE VIII below:
  • Such data structures possess great variability in length and may not fit in a single word of memory. If not, the entire value, including the sign bit, is stored in the next memory word.
  • The low order bit of the data structure represents the sign of the value, containing a 0 for a positive sign or a 1 for a negative sign.
  • The literal value of zero has a negative sign bit (1) to distinguish it from a 0 pad field, which is ignored by the emulator program.
  • The bit pattern 000001 represents a literal value of zero.
  • The bit pattern 000000 represents a pad field.
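The sign convention above can be sketched in a few lines of Python. The function names `encode_literal` and `decode_literal` are hypothetical; the behavior (sign in the low-order bit, zero stored with a minus sign, an all-zero field treated as a pad) follows the text.

```python
def encode_literal(value):
    """Encode an integer with its sign in the low-order bit (0 = plus, 1 = minus)."""
    if value == 0:
        return 1  # zero carries a minus sign so it differs from an all-zero pad
    sign = 1 if value < 0 else 0
    return (abs(value) << 1) | sign

def decode_literal(field):
    """Invert encode_literal; an all-zero field is a pad, not a value."""
    if field == 0:
        return None  # pad field, ignored by the emulator
    magnitude, sign = field >> 1, field & 1
    return -magnitude if sign else magnitude
```

Under this scheme a true value of 10 is stored as 20 and a true value of 5 as 10, which matches the token values given for the sample statement, and a literal zero is stored as 1.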
  • the statement 300 is a FORTRAN assignment statement assigning the sum of 10 + 5 to the variable A.
  • the first token 310 is a primary operator having a length of six bits and indicating an assignment statement, as shown in TABLE I above.
  • the second token 320 is a literal operand discriminator, as shown in TABLE II above, which initiates an expression.
  • the third token 330 indicates that the literal operand is an integer, as shown in TABLE VIII above.
  • the fourth token 340 has a length of 20 bits to fill the remainder of a 32 bit memory word and a value of 20 which indicates that its true value of 10 has been shifted to add a positive sign bit to the zero bit position of the memory word. Such single position shifting has the appearance of multiplying the value by 2 since it is encoded in binary form.
  • the fifth token 350 is another literal operand discriminator to preface a literal value.
  • the sixth token 360 indicates that the literal operand is an integer and the seventh token 370 has a length of 26 bits to fill the remainder of a 32-bit memory word.
  • the literal value of the seventh token 370 is 10 which indicates that a positive sign bit was added to the memory word to shift the true value of 5 by 1 bit.
  • the eighth token 380 is an operator discriminator to indicate that an operation is to be performed in the expression. It follows the literal operand token groups described above to indicate Polish post-fix notation. That is, the operator is applied to operands which precede it. During execution of the encoded statement, the literal operands are loaded into a push-down stack so that operators are applied to them on a last-in-first-out basis.
  • the ninth token 390 indicates an addition operation is to be performed, as shown in TABLE III above.
  • the tenth token 400 is another operator discriminator and the eleventh token 410 indicates an end to the expression, as shown in TABLE III above.
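The post-fix evaluation described above can be sketched with an ordinary push-down stack. This is a simplified Python illustration: the item tags ("lit", "op") and operator names ("add", "end") are hypothetical stand-ins for the discriminator and operator tokens, whose actual values are given in tables not reproduced here.

```python
def evaluate_postfix(items):
    """Evaluate a postfix sequence of ("lit", n) and ("op", name) items."""
    operators = {"add": lambda a, b: a + b}  # only addition appears in the text
    stack = []
    for kind, payload in items:
        if kind == "lit":
            stack.append(payload)            # operands are pushed as they arrive
        elif kind == "op" and payload == "end":
            break                            # Expression-End closes the expression
        elif kind == "op":
            right = stack.pop()              # last in, first out
            left = stack.pop()
            stack.append(operators[payload](left, right))
        else:
            raise ValueError(f"unknown item kind: {kind}")
    return stack.pop()
```

Feeding it the sequence literal 10, literal 5, add, end leaves 15 on the stack, the value that the second expression then stores under the name A.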
  • the twelfth token 420 initiates a new expression containing a literal operand as shown in TABLE II.
  • the thirteenth token 430 indicates that the literal operand is a name reference as shown in TABLE VIII. (The structure of this reference is shown in TABLE VI, in which its first two tokens correspond to 420 and 430 in FIG. 4.)
  • the fourteenth token 440 is the class code of the name reference, indicating the name of a local variable as shown in TABLE IV.
  • the fifteenth token 450 is the length code of the reference as shown in TABLE V, indicating that the subsequent (sixteenth) value token 460 is 5 bits in length.
  • the value token 460 is a relative pointer to the location of the local variable in the data contour 130, as shown in FIG. 2. Its value, one, indicates the location of the first local variable (local variable A 130G1) in the Local Variables Table 130G of the data contour 130 located by the local variables offset pointer in the Header 110.
  • the seventeenth token 470 is another operator discriminator and the eighteenth token 480 is an Expression-End operator, as shown in TABLE III.
  • The integer pair symbolizing each token represents a logical switching context in the emulator program containing a branch corresponding to each integer in the set of integers representable by the token of the specified length.
  • the branching operation of the emulator program may be conceptualized as a logical tree system.
  • the token stream defines a path through a hierarchy of integer pairs, which achieves the execution of the statement.
  • The primary operator token, the integer pair 6:1 (length six, value one), is a member of the highest order set, which represents the statement context of the high-level language.
  • the integer one (500), indicated by the value of the token, designates a branch to the Assign (primary) operator, as shown in TABLE I.
  • the Assign operator automatically invokes expression subcontext which begins with discriminator subcontext, as shown in TABLE II.
  • the length of the discriminator token (510) allows three branches (zero values are invalid in most contexts since zeros are used as padding, causing the emulator program to proceed to the next word).
  • the value of the discriminator token (510), three, selects the third branch indicating that a literal data structure follows.
  • the literal data structure begins with a type subcontext as shown in TABLE VIII.
  • the length of the type token (520), four bits, allows up to 15 type branches (excluding the zero value).
  • the value, four, of the type token (520) selects the integer type.
  • the literal data structure is concluded with the value of the integer. If enough bits remain in the word containing the preceding tokens to represent the integer value, then the remainder of the word is used. Otherwise, the remainder of the word is zero-filled (and ignored by the emulator program) and the subsequent full word is used (as a token) to contain the value of the integer.
  • twenty bits remain in the 32-bit word, and are used as the value token.
  • The value itself consists of a zero in the low order bit (the sign bit) indicating a plus sign, and the value 10 in the high-order nineteen bits.
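The walk through the contexts can be sketched as successive bit-field reads from one 32-bit word. This Python sketch rests on two assumptions: the discriminator is taken to be two bits wide (inferred from the statement that it allows three nonzero branches), and the example word is assembled here from the described fields rather than quoted from the source. The names `read_field` and `walk_first_word` are hypothetical.

```python
WORD_BITS = 32

def read_field(word, position, length):
    """Return `length` bits of `word` starting at bit `position` (bit 0 is the LSB)."""
    return (word >> position) & ((1 << length) - 1)

def walk_first_word(word):
    """Follow the statement, discriminator, type and value contexts in order."""
    position = 0
    path = {}
    for name, length in (("primary", 6), ("discriminator", 2), ("type", 4)):
        path[name] = read_field(word, position, length)
        position += length
    value_bits = WORD_BITS - position      # the rest of the word holds the value
    raw = read_field(word, position, value_bits)
    path["value_bits"] = value_bits
    path["sign"] = raw & 1                 # low-order bit: 0 = plus, 1 = minus
    path["magnitude"] = raw >> 1
    return path

# Word assembled for illustration from the fields described in the text:
# Assign (6 bits, 1), literal (2 bits, 3), integer (4 bits, 4), value (20 bits, 20).
EXAMPLE_WORD = 1 | (3 << 6) | (4 << 8) | (20 << 12)
```

Walking `EXAMPLE_WORD` yields primary operator 1, discriminator 3, type 4, a 20-bit value field, a plus sign and a magnitude of 10, mirroring the path described for FIG. 5.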
  • the characteristics of the high level language are thus directly represented by the content and ordering of token streams. Branches are made isomorphically to the high level language so it is unnecessary to compile multiple machine instructions to interpret the logic of high level language statements. Rather, such statements are executed in direct correspondence with their intrinsic characteristics as represented by the token streams.
  • An emulator program, such as shown in Microfiche Appendix B and incorporated herein, is loaded into the instruction memory 50 and causes the processor 40 to execute token streams.
  • the emulator program is written specifically to support the language of the program being executed. That is, each high level language requires its own emulator program in order to provide direct interpretation of tokens via micro-programmed branching.
  • the emulator program interprets token streams using microcode instructions of the machine language of the processor 40.
  • the processor 40 has an instruction set of 24 hardware operations that may be combined into a composite (dual) instruction format having a left hand side (LHS) and a right hand side (RHS), as shown in FIG. 6a.
  • the composite instruction is 32 bits wide and contains seven fields. Alternatively, the instruction may have only a LHS and contain only five fields, as shown in FIG. 6b.
  • The LHS portion of an instruction contains an arithmetic, logic or shift operation between two operands with the result assigned to a third operand.
  • The RHS portion of an instruction contains a second operation, including external bus instructions, subroutine link/return skips, transfers, or memory indexing.
  • a three address instruction format is used for LHS operations, having the following symbolic form:
  • A = 10 + 5
  • The T field of an LHS instruction specifies a register containing the local variable "A".
  • The A field specifies a register containing the value of 10.
  • the B field specifies a register containing the value of 5.
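The three-address form can be modeled as a function that reads two source registers and writes a target register. This is a minimal Python sketch under stated assumptions: the operation names and register names (R1 to R3) are hypothetical, since the 24 hardware operations and the field encodings are not listed in this text.

```python
import operator

# Hypothetical operation names; the actual instruction set is not listed here.
OPERATIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
}

def execute_lhs(registers, op, t, a, b):
    """Apply `op` to the registers named by the A and B fields; store into T."""
    registers[t] = OPERATIONS[op](registers[a], registers[b])
    return registers[t]

# Example of assigning 10 + 5 to A: R1 stands in for the register holding
# local variable A, R2 and R3 for the registers holding 10 and 5.
registers = {"R1": 0, "R2": 10, "R3": 5}
execute_lhs(registers, "add", "R1", "R2", "R3")
```

After the call, the register selected by the T field holds 15 and the two source registers are unchanged.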
  • the processor 40 of the preferred embodiment is contained on a single very large scale integrated circuit (VLSI) silicon chip 800. It executes instructions in a "pipeline" manner in four phases Ph I, Ph II, Ph III and Ph IV. That is, four sequential instructions are concurrently executed in one of the four phases, each phase being one clock cycle in duration. As execution of an instruction is completed, it exits the pipeline and a new instruction enters it. The intermediate instructions simultaneously advance to their next phase of execution. During the first phase Ph I, an instruction is fetched from the instruction memory 50 and loaded into an instruction register 810. The address of the instruction is indicated by a location counter 830 which is either incremented sequentially or given values by transfer, return or conditional transfer instructions.
  • the instruction is then decoded in the second phase Ph II by an instruction decoder 820.
  • the LHS is decoded for the Operator, A and B fields but not for the T field, which is passed unaltered to the next phase.
  • the A and B fields designate registers in a general register file 840, or literal registers 850 and 860, whose values are passed to an A Register 870 and a B Register 880.
  • The Operator field is passed to an Opcode Decoder 890 which determines which operation is specified by the Operator field.
  • the RHS is also decoded during the second phase Ph II, but only the Unconditional Transfer, Link, Link Conditional, Return and Load K Register instructions are acted upon during the second phase. All other RHS instructions are passed to the next phase.
  • the address field of the instruction is used to select the next value for the location counter 830, which references the address for the instruction to be fetched immediately after the instruction in Ph I advances.
  • The Link and Link Conditional instructions are executed in similar fashion to the Unconditional Transfer instruction, but in addition they "push" the accompanying location counter value onto the Link Stack Register 900 for subsequent use by the Return instruction. That instruction "pops" a value from the Link Stack Register 900 and adds the value of its address field to the popped value. The sum is used as the next value of the location counter 830.
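The Link and Return behavior can be sketched as a small Python model of the location counter and link stack. The class and method names are hypothetical, and the sketch assumes that the value pushed is the location counter's value at the time of the Link; the text does not say whether that is the address of the Link instruction or of its successor.

```python
class Sequencer:
    """Toy model of the location counter 830 and Link Stack Register 900."""

    def __init__(self, start=0):
        self.location_counter = start
        self.link_stack = []

    def transfer(self, address):
        """Unconditional Transfer: the address field becomes the next location."""
        self.location_counter = address

    def link(self, address):
        """Link: push the accompanying location counter value, then transfer."""
        self.link_stack.append(self.location_counter)
        self.transfer(address)

    def ret(self, address_field=0):
        """Return: pop a value and add the address field to form the next location."""
        self.location_counter = self.link_stack.pop() + address_field
```

With the counter at 100, `link(500)` moves execution to 500 and a later `ret(1)` resumes at 101, so the address field of Return acts as an offset past the saved location.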
  • the Load K Register instruction loads its address field into the K Register 910, which is a special purpose register used in conjunction with a Memory Address Register 920 to reference the cache memory 60.
  • In the third phase Ph III, only the LHS instruction is acted upon.
  • the values selected in the second phase Ph II are operated upon as specified by the Operator field and the result is sent to an X Register 930, which gives the succeeding instruction access to that result when it advances in the next clock cycle.
  • the result of the R3 + R2 operation is stored in the X Register 930 and then added to R1 when the follow up instruction advances into the third phase Ph III.
  • In the fourth phase Ph IV, the result of the LHS instruction held in the X Register 930 is stored in a general register or special register as specified by the T field. It is also used for conditional transfer or skip testing by the Test and Skip Logic 940. If a transfer or skip is indicated by the result, the address field of the instruction is used to select the next value of the location counter 830 and causes the instructions in the first three phases to be inhibited.
  • the processor 40 communicates with the cache memory 60 through a Cache Memory Interface 950. It communicates with external systems through External Bus Logic 960 which is connected to a 32-bit-wide bi-directional bus 970.
  • the interface 70 between the processor 40 and the host computer 20 is connected to the external bus 970 and responds to commands under control of the program in the instruction memory 50.
  • Although the external bus 970 is 32 bits wide, only bits 15-0 and parity bits 1-0 are used for communicating with the interface 70.
  • Two read and two write commands are implemented as follows:
  • the letters "ss” refer to the subsystem address of the interface 70.
  • The Write Data function causes 16 bits of data to be sent from the processor 40 to the interface 70. A "data" bit in a status register of the interface 70 is also set to indicate that data has been transmitted. However, if either the data bit or the control bit is already set, the operation is deferred.
  • The Write Control function causes 16 bits to be sent from the processor 40 to the interface 70 and sets a control bit in the interface status register if neither the data bit nor the control bit is already set. If either bit is already set, the operation is deferred.
  • the Read Data Register function causes the contents of a 16-bit data register of the interface 70 to be sent to the processor 40.
  • the contents of the interface data register are loaded by the host computer 20 as either data or control information.
  • The data/control bit of the interface status register is then reset unless neither the data nor the control bit is set. In that event, the operation is deferred.
  • the Read Status function causes the interface 70 to send to the processor 40 the contents of the interface status register in a format as shown in TABLE XI:
  • the Read Status function is never deferred.
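The deferral rules above can be sketched as a one-directional mailbox in Python. This is a simplified model under stated assumptions: each direction of the interface is treated as its own register with its own data and control bits, the class and method names are hypothetical, and the status register layout of TABLE XI is not reproduced, so status is returned as named flags rather than bit positions.

```python
class Mailbox:
    """One direction of the interface: a 16-bit register plus data/control bits."""

    def __init__(self):
        self.register = 0
        self.data_bit = False
        self.control_bit = False

    def post(self, value, control=False):
        """Writer side. Returns False when the operation is deferred."""
        if self.data_bit or self.control_bit:
            return False                  # previous word has not been read yet
        self.register = value & 0xFFFF    # only 16 bits are transferred
        self.control_bit = control
        self.data_bit = not control
        return True

    def take(self):
        """Reader side. Returns None when deferred, else (kind, value)."""
        if not (self.data_bit or self.control_bit):
            return None                   # nothing has been loaded
        kind = "control" if self.control_bit else "data"
        self.data_bit = self.control_bit = False
        return kind, self.register

    def status(self):
        """Status can always be read; it is never deferred."""
        return {"data": self.data_bit, "control": self.control_bit}
```

In this model `post` plays the role of Write Data or Write Control and `take` the role of Read Data Register, so a second write is deferred until the first word has been taken, and a read is deferred until something has been posted.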
  • the host computer 20 communicates with the processor 40 as if it were an input/output device. Accordingly, the interface to the host computer 20 is formatted according to standard input/output commands of the host computer.
  • A data processing system executes a high level computer language program by encoding statements of the program into variable length tokens and then executing the tokens.
  • Each token is a variable length bit field having a value representative of a semantic element of a program statement and a length representative of the context of the semantic element.
  • VLSI very large scale integrated circuit
  • the present invention generally relates to a system for execution of high level computer language programs. More particularly, it relates to a system for emulating high level computer languages and executing programs written in such languages.
  • Computer languages have traditionally been divided into two classes: programming languages and machine languages. As the name implies, machine languages have been used to refer to machine elements of computers and have been directed to each action to be performed by the computers for which they are written.
  • Programming languages are generally considered to be "high level” languages because their operators and constructs correspond more to properties of applications or problems to be solved rather than actual machine functions or physical elements. Because high level languages typically do not make reference to actual machine hardware, high level languages have been translated into machine languages in order for computers to execute the statements of high level languages.
  • the traditional method for translating high level languages has been to compile several machine language instructions for each high level language statement. Multiple machine language instructions have been used because the logic of high level languages typically cannot be expressed in one-to- one correspondence with machine language instructions.
  • a disadvantage of compiling several machine language instructions for each high level language statement is that most computers execute only one instruction at a time. Moreover, each instruction must be fetched from a memory location one at a time, presenting a "bottleneck” between memory access time and machine processing speed.
  • VLSI very large scale integrated
  • MICROFICHE APPENDIX A is a computer program listing of the encoder program of the preferred embodiment, contained on three microfiche having 201 frames;
  • MICROFICHE APPENDIX B is a computer program listing of the emulator program of the preferred embodiment, contained on 2 microfiche having 162 frames.
  • the present invention solves the problems associated with the compiling of high level languages by encoding such languages in variable length tokens representative of characteristics intrinsic to such languages.
  • the tokens are then executed by a processor which is microprogrammed to emulate a high level language.
  • the system of the present invention may be used independently or in conjunction with a host computer system, such as an IBM PC AT or equivalent system, to provide high speed processing of application programs written in a multiplicity of high level languages. Because the processor of the present invention is microprogrammable, encoder and emulator programs written for specific languages may be used as microcode for the processor.
  • FIG. 1 is a block diagram of apparatus of the preferred embodiment
  • FIG. 2 is a block diagram of a program procedure
  • FIG. 3 is a logic flow-chart showing the word padding function of the preferred embodiment
  • FIG. 4 is a sample listing of tokens of the preferred embodiment, which are used to represent a high level language statement
  • FIG. 5 is a conceptualized depiction of the branching operation of the preferred embodiment
  • FIGs. 6a and 6b are graphic representations of alternative instruction formats of the processor of the preferred embodiment.
  • FIG. 7 is a block diagram of the processor of the preferred embodiment.
  • An expansion board 10 is added to a host computer 20, such as an IBM PC AT or equivalent system.
  • the expansion board 10 includes a main memory 30, a processor 40, a cache memory 50, an instruction memory 60, an interface 70 with the host computer 20 and bus and control lines 80.
  • The main memory 30 is a dynamic random access memory unit having a storage capacity of approximately 1 megabyte. Information is stored in the main memory 30 in 32-bit words.
  • An encoder program is loaded into the instruction memory 60 by the host computer 20 through the interface 70 and bus lines 80.
  • The instruction memory is a loadable read only memory (ROM) unit having a storage capacity of 64K 32-bit words.
  • a listing of a representative encoder program is attached to this disclosure as Microfiche Appendix A and incorporated by reference.
  • the encoder program causes the processor 40 to fetch statements of the high level language program from the main memory 30, encode each statement in a representative stream of variable length bit fields ("tokens") without regard to word boundaries, and store the encoded statements in the main memory 30.
  • an emulator program is loaded into the instruction memory 60 by the host computer 20 through the interface 70 and bus lines 80.
  • a listing of a representative emulator program is attached to this disclosure as Microfiche Appendix B and incorporated by reference.
  • The emulator program causes the processor 40 to fetch the encoded statements from the main memory 30 and to interpret the tokens of the encoded statements into microcode instructions, resulting in execution of the high level language. That is, the instruction tokens executed by the emulator program have a syntax and data structure resembling the syntax and data structure of the high level language rather than the physical elements of the processor. Instruction tokens are packed into the main memory 30 and subsequently fetched and executed by the processor 40.
  • Each procedure 100 has a specific function to perform and typically comprises three major parts: a header 110; a code body 120; and a data contour 130.
  • the header 110 contains a table which describes the bit size and storage locations of the code body 120 and the data contour 130.
  • the code body 120 contains the logic statements of the procedure and the data contour 130 contains data or storage locations of data to be used in the procedure.
  • absolute addresses are not used to provide location references to elements of the code body 120 or data contour 130. Rather, all location references are relative to fixed positions in the data contour 130 or the header 110. Thus, machine references are not directly made. An exception exists, however, for global references which are used to refer to other procedures and to common or global variables.
  • Each statement of the high level language is represented in a stream of instruction tokens generally without regard to word size, again allowing for independence from machine constraints.
  • Each token is a variable length bit field representable by an integer pair having the following general form.
  • the first integer of the pair is a constant of the emulator program, fixed by the order of tokens in the language representation, which indicates the length of the token in bits.
  • the second integer is the value of the token, representative of an operator or operand corresponding to characteristics that are intrinsic to the execution of the high level language.
  • The integer pair provides a switching context within the emulator program, potentially having a branch for each integer in the set of all integers representable by a token of the specified length. That is, during execution, the emulator program interprets each token to cause the processor 40 to branch to locations in a fashion analogous to a FORTRAN computed GO TO statement or a PASCAL CASE statement.
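  • The switching context described above can be sketched in software as a table lookup keyed by the token value. This is only an illustration: the function name `make_context` and the labels for branches 1 and 2 are hypothetical placeholders, and only the mapping of value 3 to a literal in a two-bit context comes from the description.

```python
def make_context(length_bits, branches):
    """Build a dispatcher for one switching context.

    length_bits is the first integer of the pair (a constant of the
    emulator, not stored in the token stream); the token value is the
    second integer and selects the branch.
    """
    limit = 1 << length_bits

    def dispatch(value):
        # Zero is reserved for padding, so it never selects a branch here.
        if not 0 < value < limit:
            raise ValueError(
                f"{value} is not a valid non-pad {length_bits}-bit token")
        return branches[value]()

    return dispatch


# Hypothetical two-bit context. Only "3 selects a literal" is taken from
# the text; the other two labels are placeholders.
discriminator = make_context(2, {
    1: lambda: "placeholder-branch-1",
    2: lambda: "placeholder-branch-2",
    3: lambda: "literal",
})

print(discriminator(3))
```

  • A two-bit context therefore offers at most three usable branches, and a four-bit context at most fifteen, because the all-zero value is never dispatched.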
  • Token stream format is presented herein in a left-to-right ordering but in actual implementation, the ordering is right-to-left. That is, token streams fill memory words of the main memory 30 beginning at the least significant bit (bit position 0) of a memory word and ending at the most significant bit (bit position 31).
  • Although the tokens are generated without regard to word size, they must be packed in memory according to the word boundaries of the main memory 30, that is, 32 bits. Accordingly, as shown in FIG. 3, the number of bits remaining in a memory word is calculated (200) for each token stored.
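  • A minimal sketch of this packing step follows. It assumes, consistently with the zero-pad behaviour described elsewhere in the description, that a token which does not fit in the bits remaining is moved to the next word and the unused bits are left as zeros; the full logic of FIG. 3 is not reproduced in the text, so the function `pack_tokens` is an illustration rather than the encoder of Appendix A.

```python
WORD_BITS = 32


def pack_tokens(tokens):
    """Pack (length, value) tokens into 32-bit words, filling from bit 0."""
    words, current, used = [], 0, 0
    for length, value in tokens:
        if not 0 < length <= WORD_BITS or not 0 <= value < (1 << length):
            raise ValueError(f"bad token ({length}:{value})")
        remaining = WORD_BITS - used      # bits left in the current word
        if length > remaining:            # token does not fit: zero-pad
            words.append(current)
            current, used = 0, 0
        current |= value << used          # place token above earlier ones
        used += length
    if used:
        words.append(current)
    return words


# The first four tokens of the assignment example: (6:1), (2:3), (4:4), (20:20)
print([hex(w) for w in pack_tokens([(6, 1), (2, 3), (4, 4), (20, 20)])])
```

  • The four tokens total exactly 32 bits, so they occupy a single word with no padding.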
  • the first integer in each primary operator token representation indicates that the instruction token is six bits in length. This integer is not part of the instruction token stream, but is part of the emulator program which executes the token stream. Its value is predictable according to the ordering of tokens in the language representation.
  • the second integer indicates which statement of the high level language is being encoded. For each of these operators there are suboperations corresponding to the suboperations of the high level language, thus providing isomorphic representation of the high level language. A common suboperation is the
  • an operand which contains a blend of operators and operands.
  • An operand may in turn correspond to syntax names or literals as in the high level language.
  • Operands are token structures corresponding to syntax names that are used to reference the content of data structures. However, operand references are not simply memory addresses. Rather, the mapping of references to data content is a dynamic process. Two kinds of operand references are employed in code body 120 expressions: variable references and name references. Variable references are references to variable data structures, while name references correspond to names of variable data structures. In general, variable references are used to retrieve data, whereas name references are used to store data. A name is considered a literal.
  • The second field of the name reference is actually a type code (4:3) referring to type "name," since a name is a literal data structure.
  • The reference structure has a four-bit class field, a variable format primary reference and may have a secondary reference relative to the primary one.
  • the primary reference is an index to one of the tables in the contour 130 portion of the object program, as designated by the class field.
  • the four-bit class field and associated class designations are listed in TABLE IV below:
  • the primary reference format has a two bit length code and a value subfield.
  • the length of the value subfield is encoded in the length specification shown in TABLE V below:
  • the preceding class code and the length code are contiguous in the same word so that a zero length code may be distinguished from a zero pad, mentioned above.
  • the value represents a relative pointer into one of the tables in the contour 130 as specified by the class field.
  • the length code applies only to the block-value.
  • the offset value is a "word encoded" literal, meaning that its size is determined by the number of remaining bits in the word containing the preceding fields. If enough bits remain to contain the value (which cannot be zero), these bits are used as the offset-value field. Otherwise, the remaining bits are zero, and the next full word is used as the offset field.
  • Literal operands and operand data structures have similar formats. Literal operands exist in the code body 120. Operand data structures, if they exist prior to execution, are stored in the data contour 130. Otherwise, they are created and stored dynamically and are referenced indirectly through the data contour 130 tables. Data structures are prefaced by a four-bit type code, as shown in TABLE VIII below:
  • Such data structures possess great variability in length and may not fit in a single word of memory. If not, the entire value, including the sign bit is stored in the next memory word.
  • the low order bit of the data structure represents the sign of the value, containing a 0 for a positive sign or a 1 for a negative sign.
  • the literal value of zero has a negative sign bit (1) to distinguish it from a 0 pad field, which is ignored by the emulator program.
  • (000001) represents a literal value of zero.
  • (000000) represents a pad field.
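  • The sign-bit convention can be illustrated with a short sketch. The function names are hypothetical; the rules encoded here (magnitude shifted left by one position, sign in the low order bit, zero carrying a negative sign bit, an all-zero field meaning padding) are the ones stated in the description.

```python
def encode_literal(n):
    """Encode an integer with its sign in bit 0 (0 = plus, 1 = minus)."""
    if n == 0:
        return 1                          # zero is stored as "negative zero"
    return (abs(n) << 1) | (1 if n < 0 else 0)


def decode_literal(field):
    """Recover the integer from an encoded literal field."""
    if field == 0:
        raise ValueError("an all-zero field is padding, not a literal")
    magnitude = field >> 1
    negative = bool(field & 1)
    return -magnitude if negative and magnitude else magnitude


print(encode_literal(10), encode_literal(5), encode_literal(0))
```

  • Under this convention a positive value appears doubled in the token listing, which is why the literal 10 is stored as 20.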
  • the statement 300 is a FORTRAN assignment statement assigning the sum of 10 + 5 to the variable A.
  • the first token 310 is a primary operator having a length of six bits and indicating an assignment statement, as shown in TABLE I above.
  • the second token 320 is a literal operand discriminator, as shown in TABLE II above, which initiates an expression.
  • the third token 330 indicates that the literal operand is an integer, as shown in TABLE VIII above.
  • the fourth token 340 has a length of 20 bits to fill the remainder of a 32 bit memory word and a value of 20 which indicates that its true value of 10 has been shifted to add a positive sign bit to the zero bit position of the memory word. Such single position shifting has the appearance of multiplying the value by 2 since it is encoded in binary form.
  • the fifth token 350 is another literal operand discriminator to preface a literal value.
  • the sixth token 360 indicates that the literal operand is an integer and the seventh token 370 has a length of 26 bits to fill the remainder of a 32-bit memory word.
  • The literal value of the seventh token 370 is 10, which indicates that a positive sign bit was added to the memory word, shifting the true value of 5 by 1 bit.
  • the eighth token 380 is an operator discriminator to indicate that an operation is to be performed in the expression. It follows the literal operand token groups described above to indicate Polish post-fix notation. That is, the operator is applied to operands which precede it. During execution of the encoded statement, the literal operands are loaded into a push-down stack so that operators are applied to them on a last-in-first-out basis.
  • the ninth token 390 indicates an addition operation is to be performed as shown in TABLE III above.
  • the tenth token 400 is another operator discriminator and the eleventh token 410 indicates an end to the expression, as shown in TABLE III above.
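  • The post-fix evaluation of the expression part of the statement can be sketched with an ordinary push-down stack. The item tags `"lit"` and `"op"` and the operator names `"add"` and `"end"` are hypothetical stand-ins for the discriminator and operator tokens; the actual token values come from tables that are not reproduced here.

```python
def evaluate_postfix(items):
    """Evaluate a post-fix item list; ("op", "end") closes the expression."""
    operators = {"add": lambda a, b: a + b}
    stack = []
    for kind, payload in items:
        if kind == "lit":
            stack.append(payload)         # operands are pushed as they arrive
        elif payload == "end":
            break                         # Expression-End operator
        else:
            right = stack.pop()           # last in, first out
            left = stack.pop()
            stack.append(operators[payload](left, right))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


expression = [("lit", 10), ("lit", 5), ("op", "add"), ("op", "end")]
print(evaluate_postfix(expression))
```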
  • the twelfth token 420 initiates a new expression containing a literal operand as shown in TABLE II.
  • the thirteenth token 430 indicates that the literal operand is a name reference as shown in TABLE VIII. (The structure of this reference is shown in TABLE VI, in which its first two tokens correspond to 420 and 430 in FIG. 4.)
  • the fourteenth token 440 is the class code of the name reference, indicating the name of a local variable as shown in TABLE IV.
  • the fifteenth token 450 is the length code of the reference as shown in TABLE V, indicating that the subsequent (sixteenth) value token 460 is 5 bits in length.
  • the value token 460 is a relative pointer to the location of the local variable in the data contour 130, as shown in FIG. 2. Its value, one, indicates the location of the first local variable (local variable A 130G1) in the Local Variables Table 130G of the data contour 130 located by the local variables offset pointer in the Header 110.
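  • The relative addressing used by this reference can be sketched as follows. The dictionary keys `contour_base` and `local_variables_offset`, and the assumption that each table entry occupies one word, are hypothetical; the description states only that the value is a one-based relative pointer into a table located through an offset pointer in the header.

```python
def resolve_local(header, index):
    """Return the word address of local variable `index` (one-based)."""
    if index < 1:
        raise ValueError("local variable indices start at one")
    table_base = header["contour_base"] + header["local_variables_offset"]
    return table_base + (index - 1)


header = {"contour_base": 100, "local_variables_offset": 8}
print(resolve_local(header, 1))
```

  • Because the token carries only the relative index, relocating the data contour changes the header but not the encoded statement.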
  • the seventeenth token 470 is another operator discriminator and the eighteenth token 480 is an Expression-End operator, as shown in TABLE III.
  • the integer pair symbolizing each token represents a logical switching context in the emulator program containing a branch corresponding to each integer in the set of integers representable by the token of the specified length.
  • the branching operation of the emulator program may be conceptualized as a logical tree system.
  • the token stream defines a path through a hierarchy of integer pairs, which achieves the execution of the statement.
  • the primary operator token (6:1), is a member of the highest order set which represents the statement context of the high-level language.
  • the integer one (500), indicated by the value of the token, designates a branch to the Assign, (primary) operator, as shown in TABLE I.
  • the Assign operator automatically invokes expression subcontext which begins with discriminator subcontext, as shown in TABLE II.
  • the length of the discriminator token (510) allows three branches (zero values are invalid in most contexts since zeros are used as padding, causing the emulator program to proceed to the next word).
  • the value of the discriminator token (510), three, selects the third branch indicating that a literal data structure follows.
  • the literal data structure begins with a type subcontext as shown in TABLE VIII.
  • the length of the type token (520), four bits, allows up to 15 type branches (excluding the zero value).
  • the value, four, of the type token (520) selects the integer type.
  • the literal data structure is concluded with the value of the integer. If enough bits remain in the word containing the preceding tokens to represent the integer value, then the remainder of the word is used. Otherwise, the remainder of the word is zero-filled (and ignored by the emulator program) and the subsequent full word is used (as a token) to contain the value of the integer.
  • twenty bits remain in the 32-bit word, and are used as the value token.
  • the value itself consists of a zero in the low order bit (the sign bit) indicating a plus sign, and the value 10 in the high-order nineteen bits.
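  • As a sketch of how this first word is read back, the following extracts the four tokens from bit position 0 upward. The lengths are supplied by the caller because, as the description notes, token lengths are constants of the emulator program and are not stored in the stream; the function name `read_tokens` is hypothetical.

```python
def read_tokens(word, lengths):
    """Split a 32-bit word into token values, starting at bit position 0."""
    position, values = 0, []
    for length in lengths:
        values.append((word >> position) & ((1 << length) - 1))
        position += length
    return values


word = 0x000144C1                     # (6:1), (2:3), (4:4), (20:20) packed
operator, discriminator, type_code, value = read_tokens(word, [6, 2, 4, 20])
sign_bit, magnitude = value & 1, value >> 1
print(operator, discriminator, type_code, sign_bit, magnitude)
```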
  • the characteristics of the high level language are thus directly represented by the content and ordering of token streams. Branches are made isomorphically to the high level language so it is unnecessary to compile multiple machine instructions to interpret the logic of high level language statements. Rather, such statements are executed in direct correspondence with their intrinsic characteristics as represented by the token streams.
  • An emulator program such as shown in Microfiche Appendix B and incorporated herein, is loaded into the instruction memory 50 and causes the processor 40 to execute token streams.
  • the emulator program is written specifically to support the language of the program being executed. That is, each high level language requires its own emulator program in order to provide direct interpretation of tokens via micro-programmed branching.
  • the emulator program interprets token streams using microcode instructions of the machine language of the processor 40.
  • the processor 40 has an instruction set of 24 hardware operations that may be combined into a composite (dual) instruction format having a left hand side (LHS) and a right hand side (RHS), as shown in FIG. 6a.
  • the composite instruction is 32 bits wide and contains seven fields. Alternatively, the instruction may have only a LHS and contain only five fields, as shown in FIG. 6b.
  • the LHS portion of an instruction contains an arithmetic, logic or shift operation between two operands with the result assigned to a third operand.
  • the RHS portion of an instruction contains a second operation, including external bus instructions, subroutine link/return skips, transfers, or memory indexing.
  • a three address instruction format is used for LHS operations, having the following symbolic form:
  • A = 10 + 5
  • The T field of an LHS instruction specifies a register containing the local variable "A".
  • the A field specifies a register containing the value of 10
  • the B field specifies a register containing the value of 5.
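  • A minimal sketch of a three address LHS operation follows. The register numbers, the operator names and the 32-bit wrap-around are assumptions made for illustration; the actual field widths and opcodes of FIGs. 6a and 6b are not given in the text.

```python
MASK32 = 0xFFFFFFFF   # assumed register width

# Hypothetical operator names standing in for the Operator field.
OPERATIONS = {
    "add": lambda a, b: (a + b) & MASK32,
    "sub": lambda a, b: (a - b) & MASK32,
    "and": lambda a, b: a & b,
}


def execute_lhs(registers, operator, t, a, b):
    """Apply `operator` to registers a and b and assign the result to t."""
    registers[t] = OPERATIONS[operator](registers[a], registers[b])


registers = [0, 10, 5]                # R0 holds A, R1 holds 10, R2 holds 5
execute_lhs(registers, "add", t=0, a=1, b=2)
print(registers[0])
```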
  • The processor 40 of the preferred embodiment is contained on a single very large scale integrated circuit (VLSI) silicon chip 800. It executes instructions in a "pipeline" manner in four phases Ph I, Ph II, Ph III and Ph IV. That is, four sequential instructions are concurrently executed, each in one of the four phases, each phase being one clock cycle in duration. As execution of an instruction is completed, it exits the pipeline and a new instruction enters it. The intermediate instructions simultaneously advance to their next phase of execution. During the first phase Ph I, an instruction is fetched from the instruction memory 50 and loaded into an instruction register 810. The address of the instruction is indicated by a location counter 830 which is either incremented sequentially or given values by transfer, return or conditional transfer instructions.
  • the instruction is then decoded in the second phase Ph II by an instruction decoder 820.
  • the LHS is decoded for the Operator, A and B fields but not for the T field, which is passed unaltered to the next phase.
  • the A and B fields designate registers in a general register file 840, or literal registers 850 and 860, whose values are passed to an A Register 870 and a B Register 880.
  • The Operator field is passed to an Opcode Decoder 890 which determines which operation is specified by the Operator field.
  • The RHS is also decoded during the second phase Ph II, but only the Unconditional Transfer, Link, Link Conditional, Return and Load K Register instructions are acted upon during the second phase. All other RHS instructions are passed to the next phase.
  • the address field of the instruction is used to select the next value for the location counter 830, which references the address for the instruction to be fetched immediately after the instruction in Ph I advances.
  • The Link and Link Conditional instructions are executed in similar fashion to the Unconditional Transfer instruction but in addition they "push" the accompanying location counter value onto the Link Stack Register 900 for subsequent use by the Return instruction. That instruction "pops" a value from the Link Stack Register 900 and adds the value of its address field to the popped value. The sum is used as the next value of the location counter 830.
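  • The Link and Return behaviour can be sketched as a small stack model. The class and method names are hypothetical, and which location counter value accompanies a Link is passed in explicitly because the text does not specify it further.

```python
class Sequencer:
    """Toy model of the location counter and Link Stack Register."""

    def __init__(self):
        self.location_counter = 0
        self.link_stack = []

    def link(self, target, saved_location):
        # Push the accompanying location counter value, then transfer.
        self.link_stack.append(saved_location)
        self.location_counter = target

    def ret(self, address_field=0):
        # Pop a value and add the Return instruction's address field to it.
        self.location_counter = self.link_stack.pop() + address_field


sequencer = Sequencer()
sequencer.link(target=200, saved_location=11)
sequencer.ret(address_field=1)
print(sequencer.location_counter)
```

  • Adding the address field on return lets a subroutine come back to one of several points following the call, which is one way to read the "subroutine link/return skips" mentioned for RHS operations.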
  • the Load K Register instruction loads its address field into the K Register 910, which is a special purpose register used in conjunction with a Memory Address Register 920 to reference the cache memory 60.
  • In the third phase Ph III, only the LHS instruction is acted upon.
  • the values selected in the second phase Ph II are operated upon as specified by the Operator field and the result is sent to an X Register 930, which gives the succeeding instruction access to that result when it advances in the next clock cycle.
  • the result of the R3 + R2 operation is stored in the X Register 930 and then added to R1 when the follow up instruction advances into the third phase Ph III.
  • In the fourth phase Ph IV, the result of the LHS instruction held in the X Register 930 is stored in a general register or special register as specified by the T field. It is also used for conditional transfer or skip testing by the Test and Skip Logic 940. If a transfer or skip is indicated by the result, the address field of the instruction is used to select the next value of the location counter 830 and causes the instructions in the first three phases to be inhibited.
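  • The overlap of the four phases can be illustrated with a simple occupancy table. This sketch assumes a straight-line instruction sequence with no transfers or skips, so the inhibiting of instructions in earlier phases is not modelled; the function name is hypothetical.

```python
PHASES = ("Ph I", "Ph II", "Ph III", "Ph IV")


def pipeline_schedule(instructions):
    """Return, for each clock cycle, which instruction is in which phase."""
    if not instructions:
        return []
    schedule = []
    for clock in range(len(instructions) + len(PHASES) - 1):
        row = {}
        for depth, phase in enumerate(PHASES):
            index = clock - depth         # instruction that entered `depth` clocks ago
            if 0 <= index < len(instructions):
                row[phase] = instructions[index]
        schedule.append(row)
    return schedule


for clock, row in enumerate(pipeline_schedule(["i0", "i1", "i2", "i3", "i4"])):
    print(clock, row)
```

  • Once the pipeline is full, one instruction completes on every clock cycle even though each instruction takes four cycles from fetch to store.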
  • the processor 40 communicates with the cache memory 60 through a Cache Memory Interface 950. It communicates with external systems through External Bus Logic 960 which is connected to a 32-bit-wide bi-directional bus 970.
  • the interface 70 between the processor 40 and the host computer 20 is connected to the external bus 970 and responds to commands under control of the program in the instruction memory 50.
  • Although the external bus 970 is 32 bits wide, only bits 15-0 and parity bits 1-0 are used for communicating with the interface 70.
  • Two read and two write commands are implemented as follows:
  • The letters "ss" refer to the subsystem address of the interface 70.
  • The Write Data function causes 16 bits of data to be sent from the processor 40 to the interface 70. A "data" bit in a status register of the interface 70 is also set to indicate that data has been transmitted. However, if either the data bit or the control bit is already set, the operation is deferred.
  • The Write Control function causes 16 bits to be sent from the processor 40 to the interface 70 and sets a control bit in the interface status register if neither the data bit nor the control bit is already set. If either bit is already set, the operation is deferred.
  • the Read Data Register function causes the contents of a 16-bit data register of the interface 70 to be sent to the processor 40.
  • the contents of the interface data register are loaded by the host computer 20 as either data or control information.
  • The data/control bit of the interface status register is then reset unless neither the data bit nor the control bit is set. In that event, the operation is deferred.
  • the Read Status function causes the interface 70 to send to the processor 40 the contents of the interface status register in a format as shown in TABLE XI:
  • the Read Status function is never deferred.
  • the host computer 20 communicates with the processor 40 as if it were an input/output device. Accordingly, the interface to the host computer 20 is formatted according to standard input/output commands of the host computer.

Abstract

A data processing system executes a high level language computer program by encoding statements of the program into variable length tokens and then executing the tokens. Each token is a variable length bit field whose value represents a semantic element of a program statement and whose length represents the context of the semantic element. A single very large scale integrated (VLSI) circuit processor (40) is microprogrammed to execute the encoded program in a pipelined manner and may be used in conjunction with a host computer (20), such as an IBM AT or equivalent system.
PCT/US1987/003444 1987-01-06 1987-12-28 Microprogrammable language emulation system WO1988005190A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/US1987/003444 WO1988005190A1 (fr) 1987-01-06 1987-12-28 Microprogrammable language emulation system
EP19880900933 EP0343171A4 (en) 1987-12-28 1987-12-28 Microprogrammable language emulation system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US000,714 1987-01-06
PCT/US1987/003444 WO1988005190A1 (fr) 1987-01-06 1987-12-28 Microprogrammable language emulation system

Publications (1)

Publication Number Publication Date
WO1988005190A1 true WO1988005190A1 (fr) 1988-07-14

Family

ID=42123138

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US1987/003444 WO1988005190A1 (fr) 1987-01-06 1987-12-28 Microprogrammable language emulation system

Country Status (2)

Country Link
EP (1) EP0343171A4 (fr)
WO (1) WO1988005190A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5930512A (en) * 1996-10-18 1999-07-27 International Business Machines Corporation Method and apparatus for building and running workflow process models using a hypertext markup language

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4390946A (en) * 1980-10-20 1983-06-28 Control Data Corporation Lookahead addressing in a pipeline computer control store with separate memory segments for single and multiple microcode instruction sequences
US4437184A (en) * 1981-07-09 1984-03-13 International Business Machines Corp. Method of testing a data communication system
US4456952A (en) * 1977-03-17 1984-06-26 Honeywell Information Systems Inc. Data processing system having redundant control processors for fault detection
US4499535A (en) * 1981-05-22 1985-02-12 Data General Corporation Digital computer system having descriptors for variable length addressing for a plurality of instruction dialects
US4506325A (en) * 1980-03-24 1985-03-19 Sperry Corporation Reflexive utilization of descriptors to reconstitute computer instructions which are Huffman-like encoded
US4724521A (en) * 1986-01-14 1988-02-09 Veri-Fone, Inc. Method for operating a local terminal to execute a downloaded application program

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP0343171A4 *

Also Published As

Publication number Publication date
EP0343171A1 (fr) 1989-11-29
EP0343171A4 (en) 1991-01-30

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): JP

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): DE FR GB IT

COP Corrected version of pamphlet

Free format text: ON PAGE 28;THE DATE OF RECEIPT OF THE AMENDED CLAIMS SHOULD READ "880523"INSTEAD OF "880524"

WWE Wipo information: entry into national phase

Ref document number: 1988900933

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1988900933

Country of ref document: EP

WWW Wipo information: withdrawn in national office

Ref document number: 1988900933

Country of ref document: EP