Bytecode: Difference between revisions

imported>Ringo62
Reverted 1 edit by Elizabeth765 (talk)
 
imported>DreamRimmer bot II
m Standardise list-defined references format (bot)
 
(One intermediate revision by one other user not shown)
Line 25: Line 25:
*[[EBPF]]
*[[EBPF]]
*Berkeley Pascal<ref>{{cite web|last=G.|first=Adam Y.|title=Berkeley Pascal|website=[[GitHub]] |date=2022-07-11|url=https://github.com/adamyg/berkeley_pascal|access-date=2022-01-08}}</ref>
*Berkeley Pascal<ref>{{cite web|last=G.|first=Adam Y.|title=Berkeley Pascal|website=[[GitHub]] |date=2022-07-11|url=https://github.com/adamyg/berkeley_pascal|access-date=2022-01-08}}</ref>
*[[Byte Code Engineering Library]]
*Byte Code Engineering Library
*C to [[Java virtual machine]] compilers
*C to [[Java virtual machine]] compilers
*[[CLISP]] implementation of [[Common Lisp]] used to compile only to bytecode for many years; however, now it also supports compiling to native code with the help of [[GNU lightning]]
*[[CLISP]] implementation of [[Common Lisp]] used to compile only to bytecode for many years; however, now it also supports compiling to native code with the help of [[GNU lightning]]
Line 93: Line 93:
*[[IBM i#TIMI|TIMI]] is used by compilers on the [[IBM i]] platform.
*[[IBM i#TIMI|TIMI]] is used by compilers on the [[IBM i]] platform.
*[[Tiny BASIC#Implementation in a virtual machine|Tiny BASIC]]
*[[Tiny BASIC#Implementation in a virtual machine|Tiny BASIC]]
*[[Visual Basic for Applications]] compiles to bytecode.
*[[Visual FoxPro]] compiles to bytecode
*[[Visual FoxPro]] compiles to bytecode
*[[WebAssembly]]
*[[WebAssembly]]
Line 101: Line 102:
==See also==
==See also==
{{wiktionary|bytecode}}
{{wiktionary|bytecode}}
*[[Intermediate representation]]
* {{Annotated link |Computing platform}}
*[[Platform (computing)]]
* {{Annotated link |Intermediate representation}}
*[[Runtime system]]
* {{Annotated link |Runtime system}}


==Notes==
==Notes==
Line 109: Line 110:


==References==
==References==
{{reflist|refs=
<references>
<ref name="Jucs_Lua">{{cite web |url=http://www.jucs.org/jucs_11_7/the_implementation_of_lua/jucs_11_7_1159_1176_defigueiredo.html |title=The Implementation of Lua 5.0}} (NB. This involves a register-based virtual machine.)</ref>
<ref name="Jucs_Lua">{{cite web |url=http://www.jucs.org/jucs_11_7/the_implementation_of_lua/jucs_11_7_1159_1176_defigueiredo.html |title=The Implementation of Lua 5.0}} (NB. This involves a register-based virtual machine.)</ref>
<ref name="Dalvik">{{Cite web |url=http://source.android.com/tech/dalvik/dalvik-bytecode.html |title=Dalvik VM |url-status=dead |archive-url=https://web.archive.org/web/20130518021154/http://source.android.com/tech/dalvik/dalvik-bytecode.html |archive-date=2013-05-18 |access-date=2012-10-29}} (NB. This VM is register based.)</ref>
<ref name="Dalvik">{{Cite web |url=http://source.android.com/tech/dalvik/dalvik-bytecode.html |title=Dalvik VM |url-status=dead |archive-url=https://web.archive.org/web/20130518021154/http://source.android.com/tech/dalvik/dalvik-bytecode.html |archive-date=2013-05-18 |access-date=2012-10-29}} (NB. This VM is register based.)</ref>
Line 117: Line 118:
<ref name="Javascript">{{Cite web|url=https://2ality.com/2012/01/bytecode-myth.html|title=JavaScript myth: JavaScript needs a standard bytecode|website=2ality.com}}</ref>
<ref name="Javascript">{{Cite web|url=https://2ality.com/2012/01/bytecode-myth.html|title=JavaScript myth: JavaScript needs a standard bytecode|website=2ality.com}}</ref>
<ref name="Arizona_Icom">{{Cite web |url=http://www.cs.arizona.edu/icon/ftp/doc/ib1up.pdf |title=The Implementation of the Icon Programming Language |url-status=dead |archive-url=https://web.archive.org/web/20160305123148/http://www.cs.arizona.edu/icon/ftp/doc/ib1up.pdf |archive-date=5 March 2016 |access-date=9 September 2011}}</ref>
<ref name="Arizona_Icom">{{Cite web |url=http://www.cs.arizona.edu/icon/ftp/doc/ib1up.pdf |title=The Implementation of the Icon Programming Language |url-status=dead |archive-url=https://web.archive.org/web/20160305123148/http://www.cs.arizona.edu/icon/ftp/doc/ib1up.pdf |archive-date=5 March 2016 |access-date=9 September 2011}}</ref>
<ref name="Icon_Unicon">{{Cite web|url=http://unicon.sourceforge.net/book/ib.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://unicon.sourceforge.net/book/ib.pdf |archive-date=2022-10-09 |url-status=live|title=The Implementation of Icon and Unicon a Compendium}}</ref>
<ref name="Icon_Unicon">{{Cite web|url=https://unicon.sourceforge.net/book/ib.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://unicon.sourceforge.net/book/ib.pdf |archive-date=2022-10-09 |url-status=live|title=The Implementation of Icon and Unicon a Compendium}}</ref>
<ref name="Paul_2001_KEYBOARD">{{cite newsgroup |title=KEYBOARD.SYS internal structure |newsgroup=comp.os.msdos.programmer |author-first=Matthias R. |author-last=Paul |date=2001-12-30 |url=https://groups.google.com/d/msg/comp.os.msdos.programmer/l_IuSHsBDWQ/887rJF9IYmMJ |access-date=2016-09-17 |url-status=live |archive-url=https://archive.today/20170909082257/https://groups.google.com/forum/%23!msg/comp.os.msdos.programmer/l_IuSHsBDWQ/887rJF9IYmMJ |archive-date=2017-09-09 |quote=[…] In fact, the format is basically the same in [[MS-DOS]] 3.3&nbsp;- 8.0, [[PC&nbsp;DOS]] 3.3&nbsp;- 2000, including Russian, Lithuanian, Chinese and Japanese issues, as well as in Windows NT, 2000, and XP […]. There are minor differences and incompatibilities, but the general format has not changed over the years. […] Some of the data entries contain normal tables […] However, most entries contain ''executable code'' interpreted by some kind of [[p-code machine|p-code interpreter]] at *[[run time (program lifecycle phase)|runtime]]*, including conditional branches and the like. This is why the [[KEYB (DOS command)|KEYB]] driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. […]}}</ref>
<ref name="Paul_2001_KEYBOARD">{{cite newsgroup |title=KEYBOARD.SYS internal structure |newsgroup=comp.os.msdos.programmer |author-first=Matthias R. |author-last=Paul |date=2001-12-30 |url=https://groups.google.com/d/msg/comp.os.msdos.programmer/l_IuSHsBDWQ/887rJF9IYmMJ |access-date=2016-09-17 |url-status=live |archive-url=https://archive.today/20170909082257/https://groups.google.com/forum/%23!msg/comp.os.msdos.programmer/l_IuSHsBDWQ/887rJF9IYmMJ |archive-date=2017-09-09 |quote=[…] In fact, the format is basically the same in [[MS-DOS]] 3.3&nbsp;- 8.0, [[PC&nbsp;DOS]] 3.3&nbsp;- 2000, including Russian, Lithuanian, Chinese and Japanese issues, as well as in Windows NT, 2000, and XP […]. There are minor differences and incompatibilities, but the general format has not changed over the years. […] Some of the data entries contain normal tables […] However, most entries contain ''executable code'' interpreted by some kind of [[p-code machine|p-code interpreter]] at *[[run time (program lifecycle phase)|runtime]]*, including conditional branches and the like. This is why the [[KEYB (DOS command)|KEYB]] driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. […]}}</ref>
<ref name="Mendelson_2001_KEYBOARD">{{Cite web |url=http://www.columbia.edu/~em36/wpdos/eurodos.html |title=How to Display the Euro in MS-DOS and Windows DOS |last=Mendelson |first=Edward |author-link=Edward Mendelson |date=2001-07-20 |at=Display the euro symbol in full-screen MS-DOS (including Windows 95 or Windows 98 full-screen DOS) |url-status=live |archive-url=https://web.archive.org/web/20160917201248/http://www.columbia.edu/~em36/wpdos/eurodos.html |archive-date=2016-09-17 |access-date=2016-09-17 |quote=[…] Matthias [R.] Paul […] warns that the [[IBM PC DOS]] version of the keyboard driver uses some internal procedures that are not recognized by the [[Microsoft]] driver, so, if possible, you should use the [[IBM]] versions of both [[KEYB.COM]] and [[KEYBOARD.SYS]] instead of mixing Microsoft and IBM versions […]}} (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)</ref>
<ref name="Mendelson_2001_KEYBOARD">{{Cite web |url=http://www.columbia.edu/~em36/wpdos/eurodos.html |title=How to Display the Euro in MS-DOS and Windows DOS |last=Mendelson |first=Edward |author-link=Edward Mendelson |date=2001-07-20 |at=Display the euro symbol in full-screen MS-DOS (including Windows 95 or Windows 98 full-screen DOS) |url-status=live |archive-url=https://web.archive.org/web/20160917201248/http://www.columbia.edu/~em36/wpdos/eurodos.html |archive-date=2016-09-17 |access-date=2016-09-17 |quote=[…] Matthias [R.] Paul […] warns that the [[IBM PC DOS]] version of the keyboard driver uses some internal procedures that are not recognized by the [[Microsoft]] driver, so, if possible, you should use the [[IBM]] versions of both [[KEYB.COM]] and [[KEYBOARD.SYS]] instead of mixing Microsoft and IBM versions […]}} (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)</ref>
Line 124: Line 125:
<ref name="SQLite">{{cite web |title=The SQLite Bytecode Engine |url=https://www.sqlite.org/opcode.html |access-date=29 August 2016 |archive-url=https://web.archive.org/web/20170414044139/http://sqlite.org/opcode.html |archive-date=14 April 2017 |url-status=dead }}</ref>
<ref name="SQLite">{{cite web |title=The SQLite Bytecode Engine |url=https://www.sqlite.org/opcode.html |access-date=29 August 2016 |archive-url=https://web.archive.org/web/20170414044139/http://sqlite.org/opcode.html |archive-date=14 April 2017 |url-status=dead }}</ref>
<ref name="Multiplan">{{cite book |title=Microsoft C Pcode Specifications |page=13 |quote=[[Multiplan]] wasn't compiled to [[machine code]], but to a kind of byte-code which was run by an [[interpreter (computing)|interpreter]], in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific [[floating point format]] to calculate on, and an external (standard) format, which was [[binary-coded decimal|binary coded decimal]] (BCD). The PACK and UNPACK instructions converted between the two.}}</ref>
<ref name="Multiplan">{{cite book |title=Microsoft C Pcode Specifications |page=13 |quote=[[Multiplan]] wasn't compiled to [[machine code]], but to a kind of byte-code which was run by an [[interpreter (computing)|interpreter]], in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific [[floating point format]] to calculate on, and an external (standard) format, which was [[binary-coded decimal|binary coded decimal]] (BCD). The PACK and UNPACK instructions converted between the two.}}</ref>
}}
</references>


[[Category:Bytecodes| ]]
[[Category:Virtualization]]
[[Category:Virtualization]]
[[Category:Bytecodes|*]]

Latest revision as of 13:54, 18 November 2025


Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable[1] source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of a compiler's parsing and semantic analysis of properties such as the type, scope, and nesting depth of program objects.

The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Intermediate representations such as bytecode may be output by programming language implementations to ease interpretation, or they may be used to reduce hardware and operating system dependence by allowing the same code to run cross-platform, on different devices. Bytecode may be executed directly on a virtual machine (a p-code machine, i.e., an interpreter), or it may be further compiled into machine code for better performance.
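The layout described above, a one-byte opcode followed by optional parameters, can be illustrated with a minimal sketch. The opcode numbers, mnemonics, and the 4-byte little-endian operand format below are assumptions invented for this illustration and do not correspond to any real bytecode format.

```python
import struct

# Hypothetical opcode numbers, invented for this illustration.
OP_PUSH = 0x01   # followed by a 4-byte little-endian signed integer
OP_ADD = 0x02    # no operand
OP_PRINT = 0x03  # no operand

# Number of operand bytes that follow each opcode.
OPERAND_SIZE = {OP_PUSH: 4, OP_ADD: 0, OP_PRINT: 0}
NAMES = {OP_PUSH: "PUSH", OP_ADD: "ADD", OP_PRINT: "PRINT"}


def encode(program):
    """Encode (opcode, operand-or-None) pairs into a bytes object."""
    out = bytearray()
    for opcode, operand in program:
        out.append(opcode)  # the one-byte opcode
        if OPERAND_SIZE[opcode]:
            out += struct.pack("<i", operand)  # the optional parameter
    return bytes(out)


def decode(code):
    """Yield (offset, mnemonic, operand) for each instruction in code."""
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        size = OPERAND_SIZE[opcode]
        operand = struct.unpack_from("<i", code, pc + 1)[0] if size else None
        yield pc, NAMES[opcode], operand
        pc += 1 + size


code = encode([(OP_PUSH, 2), (OP_PUSH, 40), (OP_ADD, None), (OP_PRINT, None)])
listing = list(decode(code))
```

In this sketch the four instructions occupy 12 bytes: two 5-byte PUSH instructions and two 1-byte instructions without operands.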

Since bytecode instructions are processed by software, they may be arbitrarily complex, but are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built.[2][3] Different parts of a program are often stored in separate files, similar to object modules, but are loaded dynamically during execution.
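The difference between the two designs can be sketched by evaluating the expression (a + b) * c on a toy stack machine and a toy register machine. Both instruction sets, the tuple-based instruction format, and the function names are hypothetical and chosen only for illustration; real virtual machines such as those cited above are considerably more elaborate.

```python
def run_stack(program, env):
    """Stack machine: operands are implicit, popped from and pushed to a stack."""
    stack = []
    for op, *args in program:
        if op == "LOAD":
            stack.append(env[args[0]])
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


def run_register(program, env, nregs=4):
    """Register machine: each instruction names its destination and sources."""
    regs = [0] * nregs
    for op, *args in program:
        if op == "LOAD":
            dst, name = args
            regs[dst] = env[name]
        elif op == "ADD":
            dst, a, b = args
            regs[dst] = regs[a] + regs[b]
        elif op == "MUL":
            dst, a, b = args
            regs[dst] = regs[a] * regs[b]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return regs[0]  # by convention in this sketch, the result is left in register 0


# (a + b) * c expressed for each machine
stack_program = [("LOAD", "a"), ("LOAD", "b"), ("ADD",), ("LOAD", "c"), ("MUL",)]
register_program = [
    ("LOAD", 0, "a"),
    ("LOAD", 1, "b"),
    ("ADD", 0, 0, 1),
    ("LOAD", 1, "c"),
    ("MUL", 0, 0, 1),
]
env = {"a": 2, "b": 3, "c": 4}
stack_result = run_stack(stack_program, env)
register_result = run_register(register_program, env)
```

The stack instructions carry no operand locations, which keeps them short, whereas the register instructions state explicitly where each value is read and written.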

Execution

A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but does not lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format and then JIT-compiled to machine code before execution. This compilation introduces a delay before a program runs, but improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).[4]

Because of its performance advantage, today many language implementations execute a program in two phases, first compiling the source code into bytecode, and then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP, Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
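The two phases can be observed in CPython using the built-in functions compile() and exec(). The following is a minimal sketch; the source string, the "<example>" file name, and the variable names are arbitrary choices made for this illustration.

```python
source = "total = sum(n * n for n in range(4))"

# Phase 1: compile the source text into a code object holding CPython bytecode.
code_object = compile(source, "<example>", "exec")
raw_bytecode = code_object.co_code  # the bytecode itself, as a bytes object

# Phase 2: pass the compiled code to the virtual machine for execution.
namespace = {}
exec(code_object, namespace)
total = namespace["total"]
```

The contents of raw_bytecode are specific to the CPython version that produced them, so they are not portable between interpreter releases.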

More recently, the authors of V8[1] and Dart[5] have challenged the notion that intermediate bytecode is needed for fast and efficient VM implementation. When these claims were made, both of these language implementations performed direct JIT compilation from source code to machine code with no bytecode intermediary.[6]

Examples

(disassemble '(lambda (x) (print x)))
; disassembly for (LAMBDA (X))
; 2436F6DF:       850500000F22     TEST EAX, [#x220F0000]     ; no-arg-parsing entry point
;       E5:       8BD6             MOV EDX, ESI
;       E7:       8B05A8F63624     MOV EAX, [#x2436F6A8]      ; #<FDEFINITION object for PRINT>
;       ED:       B904000000       MOV ECX, 4
;       F2:       FF7504           PUSH DWORD PTR [EBP+4]
;       F5:       FF6005           JMP DWORD PTR [EAX+5]
;       F8:       CC0A             BREAK 10                   ; error trap
;       FA:       02               BYTE #X02
;       FB:       18               BYTE #X18                  ; INVALID-ARG-COUNT-ERROR
;       FC:       4F               BYTE #X4F                  ; ECX

In Python, compiled code can be analysed and investigated using dis, a built-in module that disassembles the low-level bytecode into mnemonics. It can be invoked from the interactive shell, for example (the exact output depends on the Python version):
>>> import dis # "dis" - Disassembler of Python byte code into mnemonics.
>>> dis.dis('print("Hello, World!")')
  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('Hello, World!')
              4 CALL_FUNCTION            1
              6 RETURN_VALUE
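
The same information is also available programmatically through dis.get_instructions(), which yields one Instruction object per bytecode instruction. The sketch below assumes a reasonably recent CPython; the opcodes emitted vary between versions, so no fixed listing is shown.

```python
import dis

# Disassemble the same statement, collecting Instruction objects instead of text.
instructions = list(dis.get_instructions('print("Hello, World!")'))

for instr in instructions:
    # offset: byte position; opname: mnemonic; argrepr: readable operand
    print(instr.offset, instr.opname, instr.argrepr)

opnames = [instr.opname for instr in instructions]
```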

References

  1. a b Script error: No such module "citation/CS1".
  2. "The Implementation of Lua 5.0". http://www.jucs.org/jucs_11_7/the_implementation_of_lua/jucs_11_7_1159_1176_defigueiredo.html (NB. This involves a register-based virtual machine.)
  3. "Dalvik VM". Archived from the original (http://source.android.com/tech/dalvik/dalvik-bytecode.html) on 2013-05-18. Retrieved 2012-10-29. (NB. This VM is register based.)
  4. Script error: No such module "citation/CS1".
  5. Script error: No such module "citation/CS1".
  6. Script error: No such module "citation/CS1".
  7. Script error: No such module "citation/CS1".
  8. Script error: No such module "citation/CS1".
  9. Script error: No such module "citation/CS1".
  10. Script error: No such module "citation/CS1".
  11. Script error: No such module "citation/CS1".
  12. Script error: No such module "citation/CS1".
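  13. Mendelson, Edward (2001-07-20). "How to Display the Euro in MS-DOS and Windows DOS". http://www.columbia.edu/~em36/wpdos/eurodos.html. Archived from the original on 2016-09-17. Retrieved 2016-09-17. (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)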
  14. Script error: No such module "citation/CS1".
  15. Script error: No such module "citation/CS1".
  16. Script error: No such module "citation/CS1".
  17. Script error: No such module "citation/CS1".