Distribution of integration sites. (A) Distribution of MMTV (n = 254), MMTV(SIN) (n = 585), MMTV(SIN)arrest (n = 671), HIV-1 (n = 3246) and MLV (n = 885) integration sites within a 20 kb window centered on transcription start sites (TSS). The vertical dashed line marks the TSS (0 kb). Integration sites were binned into 1 kb or 3 kb sequence windows and plotted as the percentage of all integrations per kb. The distribution of a set of 100 000 computationally generated random integration sites is represented by the dotted horizontal line. (B) Integration site preferences of retroviral vectors relative to the random control. Tile color indicates whether the vector integrations depart significantly (red) or not (green) from the random control. The numbers are chi-square p-values; P < 0.01 was considered a significant difference from the random dataset. (C) Distribution of integration sites within RefSeq genes (hg18). Differences between virus integrations and the random control were analyzed by a chi-square test.
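The binning in (A) and the comparison against the random control in (B) and (C) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names are hypothetical, and several choices are assumptions the legend does not specify: the input is taken to be signed distances (in bp) from each integration site to its nearest TSS, the percentage is normalized to all integrations in the dataset (including those outside the window), and the test is a Pearson chi-square on a 2x2 table without continuity correction.

```python
import math

def percent_per_kb(distances_bp, bin_kb=1, window_kb=20):
    """Bin signed TSS distances (bp) and return {bin_start_kb: % of all integrations per kb}.

    Assumption: the denominator is every integration in the dataset,
    not only those that fall inside the window.
    """
    half = window_kb * 1000 // 2
    bin_bp = bin_kb * 1000
    # Bins start at -half; the last bin is truncated at +half if the
    # window is not a multiple of the bin size (e.g. 3 kb bins in 20 kb).
    counts = {start: 0 for start in range(-half, half, bin_bp)}
    for d in distances_bp:
        if -half <= d < half:
            start = ((d + half) // bin_bp) * bin_bp - half
            counts[start] += 1
    total = len(distances_bp)
    return {start / 1000: 100.0 * c / total / bin_kb for start, c in counts.items()}

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for the table [[a, b], [c, d]] (1 degree of freedom).

    Hypothetical layout: a, b = vector sites inside / outside a feature;
    c, d = random control sites inside / outside the same feature.
    Returns (statistic, p_value). No continuity correction (an assumption).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom the chi-square survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

Under these assumptions, a tile in (B) would be colored red when `chi_square_2x2` returns a p-value below 0.01 for a given vector and feature, and green otherwise.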