Logo Search packages:      
Sourcecode: cacao-source version File versions  Download package


/* src/vm/jit/verify/typecheck.c - typechecking (part of bytecode verification)

   Copyright (C) 1996-2005, 2006, 2007, 2008
   CACAOVM - Verein zur Foerderung der freien virtuellen Maschine CACAO

   This file is part of CACAO.

   This program is free software; you can redistribute it and/or
   modify it under the terms of the GNU General Public License as
   published by the Free Software Foundation; either version 2, or (at
   your option) any later version.

   This program is distributed in the hope that it will be useful, but
   WITHOUT ANY WARRANTY; without even the implied warranty of
   General Public License for more details.

   You should have received a copy of the GNU General Public License
   along with this program; if not, write to the Free Software
   Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
   02110-1301, USA.



What's the purpose of the `typechecker`?

The typechecker analyses (the intermediate repr. of) the bytecode of
each method and ensures that for each instruction the values on the
stack and in local variables are of the correct type whenever the
instruction is executed.

type checking is a mandatory part of bytecode verification.

How does the typechecker work?

The JVM stack and the local variables are not statically typed, so the
typechecker has to *infer* the static types of stack slots and local
variables at each point of the method. The JVM spec imposes a lot of
restrictions on the bytecode in order to guarantee that this is always

Basically the typechecker solves the data flow equations of the method.
This is done in the usual way for a forward data flow analysis: Starting
from the entry point of the method the typechecker follows the CFG and
records the type of each stack slot and local variable at each point[1].
When two or more control flow paths merge at a point, the union of the
types for each slot/variable is taken. The algorithm continues to follow
all possible paths[2] until the recorded types do not change anymore (ie.
the equations have been solved).

If the solution has been reached and the resulting types are valid for
all instructions, then type checking terminates with success, otherwise
an exception is thrown.

Why is this code so damn complicated?

Short answer: The devil's in the details.

While the basic operation of the typechecker is no big deal, there are
many properties of Java bytecode which make type checking hard. Some of
them are not even addressed in the JVM spec. Some problems and their

*) Finding a good representation of the union of two reference types is
difficult because of multiple inheritance of interfaces. 

      Solution: The typeinfo system can represent such "merged" types by a
      list of proper subclasses of a class. Example:

            typeclass=java.lang.Object merged={ InterfaceA, InterfaceB }
      represents the result of merging two interface types "InterfaceA"
      and "InterfaceB".

*) When the code of a method is verified, there may still be unresolved
references to classes/methods/fields in the code, which we may not force
to be resolved eagerly. (A similar problem arises because of the special
checks for protected members.)

      Solution: The typeinfo system knows how to deal with unresolved
      class references. Whenever a check has to be performed for an
      unresolved type, the type is annotated with constraints representing
      the check. Later, when the type is resolved, the constraints are
      checked. (See the constrain_unresolved_... and the resolve_...

*) Checks for uninitialized object instances are hard because after the
invocation of <init> on an uninitialized object *all* slots/variables
referring to this object (and exactly those slots/variables) must be
marked as initialized.

      Solution: The JVM spec describes a solution, which has been
      implemented in this typechecker.

Note that some checks mentioned in the JVM spec are unnecessary[4] and
not performed by either the reference implementation, or this implementation.

--- Footnotes

[1] Actually only the types of slots/variables at the start of each
basic block are remembered. Within a basic block the algorithm only keeps
the types of the slots/variables for the "current" instruction which is
being analysed. 

[2] Actually the algorithm iterates through the basic block list until
there are no more changes. Theoretically it would be wise to sort the
basic blocks topologically beforehand, but the number of average/max
iterations observed is so low, that this was not deemed necessary.

[3] This is similar to a method proposed by: Alessandro Coglio et al., A
Formal Specification of Java Class Loading, Technical Report, Kestrel
Institute April 2000, revised July 2000 
An important difference is that Coglio's subtype constraints are checked
after loading, while our constraints are checked when the field/method
is accessed for the first time, so we can guarantee lexically correct
error reporting.

[4] Alessandro Coglio
    Improving the official specification of Java bytecode verification
    Proceedings of the 3rd ECOOP Workshop on Formal Techniques for Java Programs
    June 2001

#include "config.h"

#include <assert.h>
#include <string.h>

#include "vm/types.h"


#include "mm/memory.h"

#include "native/native.h"

#include "toolbox/logging.h"

#include "vm/access.h"
#include "vm/array.h"
#include "vm/builtin.h"
#include "vm/exceptions.h"
#include "vm/global.h"
#include "vm/primitive.h"
#include "vm/resolve.h"

#include "vm/jit/jit.h"
#include "vm/jit/parse.h"
#include "vm/jit/show.h"

#include "vmcore/loader.h"
#include "vmcore/options.h"

#include <typecheck-common.h>

/* MACROS FOR VARIABLE TYPE CHECKING                                        */

#define TYPECHECK_CHECK_TYPE(i,tp,msg)                               \
    do {                                                             \
        if (VAR(i)->type != (tp)) {                                  \
            exceptions_throw_verifyerror(state->m, (msg));           \
            return false;                                            \
        }                                                            \
    } while (0)

#define TYPECHECK_INT(i)                                             \
    TYPECHECK_CHECK_TYPE(i,TYPE_INT,"Expected to find integer value")
#define TYPECHECK_LNG(i)                                             \
    TYPECHECK_CHECK_TYPE(i,TYPE_LNG,"Expected to find long value")
#define TYPECHECK_FLT(i)                                             \
    TYPECHECK_CHECK_TYPE(i,TYPE_FLT,"Expected to find float value")
#define TYPECHECK_DBL(i)                                             \
    TYPECHECK_CHECK_TYPE(i,TYPE_DBL,"Expected to find double value")
#define TYPECHECK_ADR(i)                                             \
    TYPECHECK_CHECK_TYPE(i,TYPE_ADR,"Expected to find object value")

#define TYPECHECK_INT_OP(o)  TYPECHECK_INT((o).varindex)
#define TYPECHECK_LNG_OP(o)  TYPECHECK_LNG((o).varindex)
#define TYPECHECK_FLT_OP(o)  TYPECHECK_FLT((o).varindex)
#define TYPECHECK_DBL_OP(o)  TYPECHECK_DBL((o).varindex)
#define TYPECHECK_ADR_OP(o)  TYPECHECK_ADR((o).varindex)

/* typestate_save_invars *******************************************************
   Save the invars of the current basic block in the space reserved by

   This function must be called before an instruction modifies a variable
   that is an invar of the current block. In such cases the invars of the
   block must be saved, and restored at the end of the analysis of this
   basic block, so that the invars again reflect the *input* to this basic
   block (and do not randomly contain types that appear within the block).

       state............current state of the verifier


static void
typestate_save_invars(verifier_state *state)
      s4 i, index;
      s4 *pindex;
      LOG("saving invars");

      if (!state->savedindices) {
            LOG("allocating savedindices buffer");
            pindex = DMNEW(s4, state->m->maxstack);
            state->savedindices = pindex;
            index = state->numlocals + VERIFIER_EXTRA_VARS;
            for (i=0; i<state->m->maxstack; ++i)
                  *pindex++ = index++;

      /* save types */

      typecheck_copy_types(state, state->bptr->invars, state->savedindices, 

      /* set the invars of the block to the saved variables */
      /* and remember the original invars                   */

      state->savedinvars = state->bptr->invars;
      state->bptr->invars = state->savedindices;

/* typestate_restore_invars  ***************************************************
   Restore the invars of the current basic block that have been previously
   saved by `typestate_save_invars`.

       state............current state of the verifier


static void
typestate_restore_invars(verifier_state *state)
      LOG("restoring saved invars");

      /* restore the invars pointer */

      state->bptr->invars = state->savedinvars;

      /* copy the types back */

      typecheck_copy_types(state, state->savedindices, state->bptr->invars,

      /* mark that there are no saved invars currently */

      state->savedinvars = NULL;

/* handle_fieldaccess **********************************************************
       state............the current state of the verifier

       true.............successful verification,
         false............an exception has been thrown.


static bool
handle_fieldaccess(verifier_state *state,
                           varinfo *instance,
                           varinfo *value)
      jitdata *jd;

      jd = state->jd;

#define EXCEPTION  do { return false; } while (0)
#include <typecheck-fields.inc>

      return true;

/* handle_invocation ***********************************************************
   Verify an ICMD_INVOKE* instruction.
       state............the current state of the verifier

       true.............successful verification,
         false............an exception has been thrown.


static bool
handle_invocation(verifier_state *state)
      jitdata *jd;
    varinfo *dv;               /* output variable of current instruction */

      jd = state->jd;
      dv = VAROP(state->iptr->dst);

#define OP1   VAR(state->iptr->sx.s23.s2.args[0])
#include <typecheck-invoke.inc>
#undef  OP1

      return true;

/* handle_builtin **************************************************************
   Verify the call of a builtin method.
       state............the current state of the verifier

       true.............successful verification,
         false............an exception has been thrown.


static bool
handle_builtin(verifier_state *state)
      jitdata *jd;
    varinfo *dv;               /* output variable of current instruction */

      jd = state->jd;
      dv = VAROP(state->iptr->dst);

#define OP1   state->iptr->sx.s23.s2.args[0]
#include <typecheck-builtins.inc>
#undef  OP1

      return true;

/* handle_multianewarray *******************************************************
   Verify a MULTIANEWARRAY instruction.
       state............the current state of the verifier

       true.............successful verification,
         false............an exception has been thrown.


static bool
handle_multianewarray(verifier_state *state)
      jitdata *jd;
    varinfo *dv;               /* output variable of current instruction */

      jd = state->jd;
      dv = VAROP(state->iptr->dst);

#include <typecheck-multianewarray.inc>

      return true;

/* typecheck_invalidate_locals *************************************************
   Invalidate locals that are overwritten by writing to the given local.
       state............the current state of the verifier
         index............the index of the local that is written
         twoword..........true, if a two-word type is written


static void typecheck_invalidate_locals(verifier_state *state, s4 index, bool twoword)
      s4 javaindex;
      s4 t;
      s4 varindex;
      jitdata *jd = state->jd;
      s4 *localmap = jd->local_map;
      varinfo *vars = jd->var;

      javaindex = jd->reverselocalmap[index];

      /* invalidate locals of two-word type at index javaindex-1 */

      if (javaindex > 0) {
            localmap += 5 * (javaindex-1);
            for (t=0; t<5; ++t) {
                  varindex = *localmap++;
                  if (varindex >= 0 && IS_2_WORD_TYPE(vars[varindex].type)) {
                        LOG1("invalidate local %d", varindex);
                        vars[varindex].type = TYPE_VOID;
      else {
            localmap += 5 * javaindex;

      /* invalidate locals at index javaindex */

      for (t=0; t<5; ++t) {
            varindex = *localmap++;
            if (varindex >= 0) {
                  LOG1("invalidate local %d", varindex);
                  vars[varindex].type = TYPE_VOID;

      /* if a two-word type is written, invalidate locals at index javaindex+1 */

      if (twoword) {
            for (t=0; t<5; ++t) {
                  varindex = *localmap++;
                  if (varindex >= 0) {
                        LOG1("invalidate local %d", varindex);
                        vars[varindex].type = TYPE_VOID;

/* macros used by the generated code ******************************************/

#define EXCEPTION          do { return false; } while (0)

#define CHECK_LOCAL_TYPE(index, t)                                   \
    do {                                                             \
        if (!typevector_checktype(jd->var, (index), (t)))            \
             VERIFY_ERROR("Local variable type mismatch");           \
    } while (0)

#define STORE_LOCAL(t, index)                                        \
    do {                                                             \
         s4 temp_t = (t);                                            \
         typecheck_invalidate_locals(state, (index), false);         \
         typevector_store(jd->var, (index), (temp_t), NULL);         \
    } while (0)

#define STORE_LOCAL_2_WORD(t, index)                                 \
    do {                                                             \
         s4 temp_t = (t);                                            \
         typecheck_invalidate_locals(state, (index), true);          \
         typevector_store(jd->var, (index), (temp_t), NULL);         \
    } while (0)

#define REACH_BLOCK(target)                                          \
    do {                                                             \
        if (!typestate_reach(state, (target),                        \
                             state->bptr->outvars, jd->var,          \
                             state->bptr->outdepth))                 \
                return false;                                        \
    } while (0)

#define REACH(target)   REACH_BLOCK((target).block)

/* handle_basic_block **********************************************************
   Perform bytecode verification of a basic block.
       state............the current state of the verifier

       true.............successful verification,
         false............an exception has been thrown.


static bool
handle_basic_block(verifier_state *state)
    int opcode;                                      /* current opcode */
    int len;                        /* for counting instructions, etc. */
    bool superblockend;        /* true if no fallthrough to next block */
      instruction *iptr;                      /* the current instruction */
    basicblock *tbptr;                   /* temporary for target block */
    bool maythrow;               /* true if this instruction may throw */
      s4 i;
      typecheck_result r;
      branch_target_t *table;
      lookup_target_t *lookup;
      jitdata *jd = state->jd;
      exception_entry *ex;
      varinfo constvalue;                               /* for PUT*CONST */
      constant_FMIref *fieldref;

      LOGSTR1("\n---- BLOCK %04d ------------------------------------------------\n",state->bptr->nr);

      superblockend = false;
      state->bptr->flags = BBFINISHED;

      /* prevent compiler warnings */

      /* determine the active exception handlers for this block */
      /* XXX could use a faster algorithm with sorted lists or  */
      /* something?                                             */
      len = 0;
      for (ex = state->jd->exceptiontable; ex ; ex = ex->down) {
            if ((ex->start->nr <= state->bptr->nr) && (ex->end->nr > state->bptr->nr)) {
                  LOG1("active handler L%03d", ex->handler->nr);
                  state->handlers[len++] = ex;
      state->handlers[len] = NULL;

      /* init variable types at the start of this block */
      typevector_copy_inplace(state->bptr->inlocals, jd->var, state->numlocals);

      DOLOG(show_basicblock(jd, state->bptr, SHOW_STACK));
      DOLOG(typecheck_print_vararray(stdout, jd, state->bptr->invars, 
      DOLOG(typevector_print(stdout, jd->var, state->numlocals));

      /* loop over the instructions */
      len = state->bptr->icount;
      state->iptr = state->bptr->iinstr;
      while (--len >= 0)  {

            iptr = state->iptr;

            DOLOG(typevector_print(stdout, jd->var, state->numlocals));
            LOGNL; LOGFLUSH;
            DOLOG(show_icmd(jd, state->iptr, false, SHOW_STACK)); LOGNL; LOGFLUSH;

            opcode = iptr->opc;
            maythrow = false;

            switch (opcode) {

                  /* include generated code for ICMDs verification */

#define STATE  state
#define METHOD (state->m)
#define IPTR   iptr
#define BPTR   (state->bptr)
#include <typecheck-variablesbased-gen.inc>
#undef  STATE
#undef  METHOD
#undef  IPTR
#undef  BPTR

                        LOG1("ICMD %d\n", opcode);
                        TYPECHECK_VERIFYERROR_bool("Missing ICMD code during typecheck");

            /* reach exception handlers for this instruction */

            if (maythrow) {
                  LOG("reaching exception handlers");
                  i = 0;
                  while (state->handlers[i]) {
                        if (state->handlers[i]->catchtype.any)
                              VAR(state->exinvars)->typeinfo.typeclass = state->handlers[i]->catchtype;
                              VAR(state->exinvars)->typeinfo.typeclass.cls = class_java_lang_Throwable;
                        if (!typestate_reach(state,
                                    &(state->exinvars), jd->var, 1))
                              return false;

            LOG("\t\tnext instruction");
      } /* while instructions */

      LOG("instructions done");
      LOGSTR("RESULT=> ");
      DOLOG(typecheck_print_vararray(stdout, jd, state->bptr->outvars,
      DOLOG(typevector_print(stdout, jd->var, state->numlocals));

      /* propagate stack and variables to the following block */
      if (!superblockend) {
            LOG("reaching following block");
            tbptr = state->bptr->next;
            while (tbptr->flags == BBDELETED) {
                  tbptr = tbptr->next;
                  /* this must be checked in parse.c */
                  if ((tbptr->nr) >= state->basicblockcount)
                        TYPECHECK_VERIFYERROR_bool("Control flow falls off the last block");
            if (!typestate_reach(state,tbptr,state->bptr->outvars, jd->var,
                  return false;

      /* We may have to restore the types of the instack slots. They
       * have been saved if an <init> call inside the block has
       * modified the instack types. (see INVOKESPECIAL) */

      if (state->savedinvars)

      return true;

/* typecheck()                                                              */
/* This is the main function of the bytecode verifier. It is called         */
/* directly after analyse_stack.                                            */
/*                                                                          */
/* IN:                                                                      */
/*    meth.............the method to verify                                 */
/*    cdata............codegendata for the method                           */
/*    rdata............registerdata for the method                          */
/*                                                                          */
/* RETURN VALUE:                                                            */
/*     true.............successful verification                             */
/*     false............an exception has been thrown                        */
/*                                                                          */

#define MAXPARAMS 255

bool typecheck(jitdata *jd)
      methodinfo     *meth;
      codegendata    *cd;
      varinfo        *savedlocals;
      verifier_state  state;             /* current state of the verifier */

      /* collect statistics */

      int count_iterations = 0;
      TYPECHECK_COUNTIF(cdata->method->exceptiontablelength != 0,stat_methods_with_handlers);
      state.stat_maythrow = false;

      /* get required compiler data */

      meth = jd->m;
      cd   = jd->cd;

      /* some logging on entry */

    DOLOG( show_method(jd, SHOW_STACK) );
    LOGMETHOD("Entering typecheck: ",cd->method);

      /* initialize the verifier state */

      state.m = meth;
      state.jd = jd;
      state.cd = cd;
      state.basicblockcount = jd->basicblockcount;
      state.basicblocks = jd->basicblocks;
      state.savedindices = NULL;
      state.savedinvars = NULL;

      /* check that the basicblock numbers are valid */

#if !defined(NDEBUG)

      /* check if this method is an instance initializer method */

    state.initmethod = (state.m->name == utf_init);

      /* initialize the basic block flags for the following CFG traversal */

      typecheck_init_flags(&state, BBFINISHED);

    /* number of local variables */
    /* In <init> methods we use an extra local variable to indicate whether */
    /* the 'this' reference has been initialized.                           */
      /*         TYPE_VOID...means 'this' has not been initialized,           */
      /*         TYPE_INT....means 'this' has been initialized.               */

    state.numlocals = state.jd->localcount;
      state.validlocals = state.numlocals;
    if (state.initmethod) 
            state.numlocals++; /* VERIFIER_EXTRA_LOCALS */

            s4 i;
            s4 t;
            for (i=0; i<state.validlocals; ++i) {
                  LOG2("    %i => javaindex %i", i, jd->reverselocalmap[i]);

    /* allocate the buffer of active exception handlers */
    state.handlers = DMNEW(exception_entry*, state.jd->exceptiontablelength + 1);

      /* save local variables */

      savedlocals = DMNEW(varinfo, state.numlocals);
      MCOPY(savedlocals, jd->var, varinfo, state.numlocals);

      /* initialized local variables of first block */

      if (!typecheck_init_locals(&state, true))
            return false;

    /* initialize invars of exception handlers */
      state.exinvars = state.numlocals;
      VAR(state.exinvars)->type = TYPE_ADR;
                                          class_java_lang_Throwable); /* changed later */

    LOG("Exception handler stacks set.\n");

    /* loop while there are still blocks to be checked */
    do {

        state.repeat = false;
        state.bptr = state.basicblocks;

        for (; state.bptr; state.bptr = state.bptr->next) {
            LOGSTR1("---- BLOCK %04d, ",state.bptr->nr);
            LOGSTR1("blockflags: %d\n",state.bptr->flags);
                /* verify reached block */      
            if (state.bptr->flags == BBTYPECHECK_REACHED) {
                if (!handle_basic_block(&state))
                              return false;
        } /* for blocks */

        LOGIF(state.repeat,"state.repeat == true");
    } while (state.repeat);

      /* statistics */
      LOG1("Typechecker did %4d iterations",count_iterations);

      /* reset the flags of blocks we haven't reached */


      /* restore locals */

      MCOPY(jd->var, savedlocals, varinfo, state.numlocals);

      /* everything's ok */

    LOGimp("exiting typecheck");
      return true;
#endif /* ENABLE_VERIFIER */

 * These are local overrides for various environment variables in Emacs.
 * Please do not remove this and leave it at the end of the file, where
 * Emacs will automagically detect them.
 * ---------------------------------------------------------------------
 * Local variables:
 * mode: c
 * indent-tabs-mode: t
 * c-basic-offset: 4
 * tab-width: 4
 * End:
 * vim:noexpandtab:sw=4:ts=4:

Generated by  Doxygen 1.6.0   Back to index