Assessing and improving needle exchange programs: gaps and problems in the literature

Based on an extensive review of the needle exchange program (NEP) literature, the author sought to identify weaknesses in NEP evaluations published to date. Surprisingly, it was apparent that NEP researchers often fail to: (1) collect comparison/change data; (2) agree on appropriate and consistent dependent variables to measure; (3) use valid and reliable measurement instruments; (4) present clear operational definitions; (5) analyze outcome measures by gender/race/social context; (6) provide data on type of drug injected as it relates to risky drug and sexual behaviors; (7) measure HIV/AIDS knowledge; or, (8) clearly articulate desired NEP outcomes. Suggestions for future research are included, and conclusions about the overall state of NEP evaluative research are drawn.


Background
The moral, ethical, and legal debate continues over establishing and maintaining needle exchange programs (NEPs) in the U.S. and abroad. NEP opponents repeatedly call for "proof" of NEP effectiveness, fearing NEP use leads to increased drug use, more crime and more discarded needles, a health concern for the general public at large. There have been thousands of published NEP articles, ranging from popular press accounts of legal battles over NEPs to longitudinal and/or cross-sectional studies conducted over several years with thousands of injection drug users (IDUs).
Given many programs, in many different places, operating with different methods of evaluation, it has been difficult to compare and evaluate the overall success of NEPs [1]. Lacking the definitive, random-assignment clinical trial that is impossible for political, logistical, and ethical reasons, substantiating positive effects has been difficult [2]. To achieve that, and answer the general research question about the societal impact of NEPs on risky drug and sexual behaviors, both summative interpretive analyses and objective, quantitative meta-analyses of NEP evaluative studies have been undertaken.
At least eight interpretive analyses have been published summarizing the results of studies focused on NEP effectiveness [3][4][5][6][7][8][9][10]. In general, these narrative overviews conclude that NEPs are a useful tool to control the spread of HIV/AIDS.
More recently, a smaller group of researchers, enumerated below, have begun to examine data related to NEP effectiveness via meta-analysis, a statistical technique for quantitative, objective analysis of the results across studies, rather than an interpretive one. Des Jarlais et al. [11] used meta-analytic techniques to combine the results of three studies related to NEP effectiveness, and their results indicated that participation in NEPs was associated with a lower rate of HIV infection. Cross, Saunders, and Bartelli [12] examined U.S. and international data from ten NEP effectiveness studies conducted between 1984 and 1995, and concluded that NEPs were associated with reductions in needle sharing. Ksobiech [13] conducted a meta-analysis of 47 studies focusing on needle sharing, lending, and borrowing behaviors of NEP attenders, taken from data gathered between 1984 and 1997, and also found an inverse relationship between NEP attendance and risky needle-related behaviors.
To quantitatively extend NEP outcome measurements beyond needle sharing, Ksobiech [14] conducted a meta-analysis of NEP effectiveness studies in both the United States and abroad. A total of 64 studies, measuring 83 different dependent variables related to risky drug and sex behaviors as well as HIV/AIDS knowledge, were involved. Because of the multitude of variables, eleven separate meta-analyses were run on conceptually similar variables. Sample size for any given meta-analysis ranged from a low of 2,880 for HIV/AIDS knowledge to a high of 50,423 for injection frequency. The study found desirable societal outcomes associated with NEP attendance longitudinally, cross-sectionally, and by frequency of NEP attendance.
While the general conclusion of all these summary-type projects is favorable with respect to NEP effectiveness, they have also noted problems in areas such as generalizability, control, validity, and reliability. To address these concerns, the writer of this paper sought to review, in general terms, the NEP literature and determine whether these concerns are significant overall, or simply someone's reservations regarding a few selected NEP studies.
In order to complete a series of investigations not presented or discussed herein, all available NEP evaluative studies conducted from January 1988 to July 2001 were gathered and reviewed. To that end, a variety of databases in which NEP evaluative studies might appear were searched for relevant citations. To begin the process of reviewing the bibliography, citations that were clearly unrelated to the purpose described here were eliminated (e.g., the publication was not an academic source, such as AIDS Weekly Plus, or abstract information indicated that the published work was an editorial, letter, or less than one page in length). Many such articles were available on-line, and could be quickly examined before being eliminated. That initial review, as well as the elimination of duplicate citations, narrowed the number of bibliographic entries to approximately 3,000. Via reading abstracts and summaries, that number was again reduced to approximately 500 articles, which served as the basis for the observations and conclusions drawn in this article.
The 500 articles were then examined to identify the positive and negative aspects of the NEP evaluative literature taken as a whole. While there were indeed many desirable NEP outcomes, primarily focused on the impact that NEPs have in controlling the spread of HIV/AIDS via reductions in needle sharing, those results are not summarized herein. Rather, the remainder of this paper focuses on enumerating problems, primarily methodological, which are relatively common in NEP evaluative studies, presenting a number of suggestions for future research, as well as offering some overall conclusions regarding NEP research.

Lack of comparison/change data
Perhaps the most striking problem in the NEP evaluative literature was that many of the studies were primarily descriptive. In study after study, information regarding hours of NEP operation, number/type of locations, number of IDUs served, number of needles distributed and returned, IDU employment status, age, gender, and educational background was summarized, as if this were an annual report to stakeholders. While that information can be useful, there was often no data whatever regarding NEP effectiveness in reducing risky drug and sexual behaviors, or increasing HIV/AIDS knowledge. Indeed, there was often no comparative data whatever; even the limited descriptive information presented was often not compared to the prior year.
Clearly, researchers must gather comparison and/or change data regarding a given NEP. How do NEP attenders differ from non-attenders? How have the risky drug and sexual behaviors of NEP attenders changed over time, if at all? Is frequency of NEP attendance associated with desirable behavioral outcomes, such as a reduction in needle sharing?
NEP research is well past the merely descriptive types of studies represented in the bulk of the examined publications. The point has been quite well made that many needles are distributed and returned, that many IDUs are served, and that condoms and informational brochures are sometimes distributed to IDUs. Surprising, at least to this researcher, was the fact that only 64 NEP evaluative studies contained change and/or comparison evaluative data of any kind on the relative effectiveness of these programs (a full listing of these studies is available upon request of the author). Reporting on IDU needle sharing behavior at a fixed point in time does little to advance the body of knowledge regarding NEP effectiveness, and permits others, whose intentions are not always friendly with respect to NEPs, to interpret the limited available evidence in a fashion supportive of their particular viewpoint.

Lack of agreement on appropriate and consistent dependent variables across studies
There is great disparity across NEP evaluative studies regarding what information should be gathered from IDUs. This disparity in dependent variables occurs in various ways, and limits comparisons across studies. For example, some researchers gather data on injection frequency per month [15], while others ask about injection frequency per day [16], thus making comparisons across studies difficult at best. Some report and analyze raw data while others collapse presumably ratio or interval data into ordinal categories for subsequent analysis. While neither approach is wrong per se, the net result is studies with clearly different, non-comparable outcome measures.
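Part of the comparability problem described above is a matter of units. As a minimal sketch, and not a procedure drawn from any of the studies reviewed, the following Python fragment shows how frequencies reported per day, per week, or per month could be placed on a common monthly scale, and how collapsing that value into ordinal categories discards information. The 30-day month, the function names, and the category cut points are all hypothetical assumptions chosen only for illustration.

```python
# Illustrative sketch only: the 30-day month, the function names, and the
# ordinal cut points are assumptions, not taken from any reviewed study.
DAYS_PER_MONTH = 30.0
DAYS_PER_WEEK = 7.0


def injections_per_month(count, period):
    """Convert a self-reported injection count to a per-month figure."""
    if period == "day":
        return count * DAYS_PER_MONTH
    if period == "week":
        return count * DAYS_PER_MONTH / DAYS_PER_WEEK
    if period == "month":
        return float(count)
    raise ValueError(f"unknown reporting period: {period!r}")


def ordinal_category(monthly):
    """Collapse a monthly frequency into hypothetical ordinal categories."""
    if monthly < 30:
        return "less than daily"
    if monthly < 90:
        return "one to fewer than three per day"
    return "three or more per day"


# Two hypothetical respondents, one asked per day and one asked per month.
per_day_report = injections_per_month(3, "day")
per_month_report = injections_per_month(120, "month")

# Both fall in the same ordinal category although the monthly values differ,
# which is the information lost when ratio data are collapsed.
same_category = ordinal_category(per_day_report) == ordinal_category(per_month_report)
```

A conversion of this kind only aligns units. It does not correct for differences in recall period between questionnaires, which would still have to be addressed in study design.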
Even when the same dependent variables are assessed, it is not uncommon to find those assessments made at different times, especially in follow-up evaluations (30 days vs. 90 days vs. 120 days, etc.). Can't there be agreement as to what is the appropriate length of time between baseline and follow-up measurements across studies?
Further, distinctions between types of sharing partners were made by some researchers, but not by most. Does it really matter if one lends a needle to a running partner as opposed to a sexual partner or friend [17]? If so, shouldn't this line of questioning be incorporated into more, if not all, NEP research? Frankly, one may legitimately question whether IDUs can accurately recognize, remember, and report such subtle distinctions as those described.

Lack of data on reliability and validity of measurement instruments
At present, there does not appear to be a frequently utilized series of standard questions and/or measurement instruments employed to consistently assess various NEP-related outcomes across studies. Instead, one typically finds little more than a description of the questionnaire and/or measurement instrument with little, if any, supporting data regarding validity or reliability. Did the scales actually measure the concept in question? Did the test actually assess the HIV/AIDS knowledge level of the IDUs? Readers are left to draw their own conclusions time and time again. Thus, with unanswered questions in the areas of reliability and validity as well as researchers utilizing different questionnaires and measurement instruments across multiple studies, it is not surprising to find there are wide-ranging results on any given concept. Comparing results across studies, synthesizing results, and constructing theories to predict, explain and control the manner in which NEPs ought to be utilized becomes problematic.

Lack of clear operational definitions
Perhaps the wide-ranging results reported in NEP evaluations are related to unclear operational definitions of dependent variables. For example, what exactly is meant by the term "needle sharing," as used within and across NEP evaluations? Does it mean the NEP client participating in the study injected a drug with a needle that had been used by another person who is present? Must the person be present? Must they have already used the needle? If so, hasn't the client (or IDU) actually "borrowed" the needle from that person and thus the behavior could be labeled as "needle borrowing" as opposed to "needle sharing"? If an IDU who is an NEP client "lends" a needle to another person, is that conceptually the same as "sharing," or must that needle already have been used by the IDU prior to its "going over" to another person in order to qualify as lending? How exactly do we meaningfully differentiate among sharing, borrowing and lending in a consistent fashion across NEP studies? The answer, of course, is via the same operational definition, and yet the literature reviewed herein demonstrates that those distinctions have not yet been made within and/or across multiple NEP evaluations.
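One way to see what a shared operational definition would require is to write it down as an explicit rule. The Python fragment below is a sketch under stated assumptions: borrowing requires that the needle was previously used by someone else, lending requires that the IDU had already used the needle before passing it on, and sharing is either of the two. This rule is a hypothetical definition chosen for illustration, not one endorsed by the literature reviewed, and every class, field, and function name is invented. The question of whether the other person must be present is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class InjectionEvent:
    """One self-reported event; every field name here is hypothetical."""
    received_from_other: bool   # respondent obtained the needle from another person
    used_by_other_before: bool  # that needle had already been used by someone else
    passed_to_other: bool       # respondent gave the needle to another person
    used_by_self_before: bool   # respondent had already used it before passing it on


def is_borrowing(event):
    # Assumed definition: injecting with a needle already used by another person.
    return event.received_from_other and event.used_by_other_before


def is_lending(event):
    # Assumed definition: passing on a needle the respondent had already used.
    return event.passed_to_other and event.used_by_self_before


def is_sharing(event):
    # Assumed definition: sharing is the union of borrowing and lending.
    return is_borrowing(event) or is_lending(event)


# Giving away a sterile, unused needle is not counted as lending under this rule.
gave_sterile_needle = InjectionEvent(False, False, True, False)
# Injecting with a needle someone else had used counts as borrowing, hence sharing.
used_others_needle = InjectionEvent(True, True, False, False)
```

Whatever rule is ultimately adopted, stating it at this level of explicitness would let two studies confirm that they are counting the same events.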

Need for a category system/typology
The 64 change/comparison studies identified in this literature search provided data on 83 separate dependent variables. Ksobiech [14] has suggested a typology which placed these dependent variables into a series of categories developed to conceptually and theoretically organize the variables: needle sharing; needle sharing-extended (includes variations on the basic needle sharing question); lending/borrowing behaviors; risky circumstances/context; injection frequency; HIV rate; drug paraphernalia-sharing; drug preparation behaviors; syringe use; sexual risk behaviors; and disease/HIV knowledge. Table 1 (see Additional File 1) presents each category and a complete listing of the variables placed within it, and illustrates the plethora of different variables assessed by different NEP researchers. Additional work in this area is needed. Agreement across researchers on the category scheme to be used, and the variables contained within each category, would do much to standardize the information gathered and enhance comparability across studies.

Lack of data analysis by gender/race/social context
While NEP evaluations present basic demographic data, they typically go no further, failing to examine risky drug and/or sexual behaviors while controlling for gender, race, or social context, even though prior research suggests those could be moderator variables. Campbell [18] reported that female IDUs are more likely to be dependent upon male sexual partners for drugs and equipment, and prefer injecting with sexual partners. Further, women have an economic disadvantage in these relationships; male sexual partners are also often women's drug suppliers. Because of their lower status in the drug-using hierarchy, women are also more likely to be the last person to inject a drug, consequently using equipment and paraphernalia already "dirtied" by others [18].
Social contexts, and the interpersonal relationships found within them, may also be predictors of drug and sexual risk behaviors. For example, males frequently use shooting galleries, leading to sharing drug paraphernalia, reusing injection equipment, and/or injecting more frequently than they would in less risky environments [19]. Miller, Eskild, Mella, Moi and Magnus [20] found that women reported a higher rate of injection frequency, although not a greater use of NEPs.
Siegal et al. [21] found that there were both geographic and ethnic differences on choices of drug used. Most NEP studies examined did not report on drugs used by race/ethnicity. Therefore, it would be a valuable addition to NEP articles for the outcome data to be summarized, if not statistically analyzed, by gender, as well as by race/ethnicity (particularly in the U.S.), in order to further refine the results, and allow for comparisons across studies with similar refinements.

Lack of information on types of drugs used and risky drug behaviors
Injecting different drugs leads to different effects, and is related to the frequency of injection necessary to maintain that effect. When Bruneau, Lamothe, et al. [22] found unexpectedly riskier drug behaviors among NEP clients as compared with non-attenders, they searched for an explanation. One plausible explanation could be related to the availability of particular drugs during the times IDUs are being assessed. In a follow-up commentary, Bruneau, Franco, and Lamothe [23] state that "cocaine bingeing in the context of shooting galleries can create situations of suboptimal utilization of sterile injection equipment" (p. 1009), possibly also impacting HIV incidence.
If type of drugs used is, in fact, a confounding variable, why haven't NEP researchers routinely asked and reported on type of drugs injected, and their relationship to other risky drug behaviors, such as injection frequency? In a non-NEP study, Singer, Himmelgreen, Dushay and Weeks [24] found that geographic location, ethnicity and type of drug injected combine as a predictor of injection frequency. Watters, Estilo, Clark and Lorvick [25] found that cocaine injection was a predictor of syringe sharing. Gathering more of this type of information for NEP attenders specifically may lead to the creation of a variety of intervention programs for IDUs who inject themselves with particular types of drugs.

Lack of information and/or activity regarding risky sexual behaviors
While NEP advocates often include reducing risky sexual behaviors and increasing HIV risk knowledge as NEP goals, the bulk of the evidence gathered to date has been in the area of risky drug behaviors. Only 13 studies measured a change/comparison in behavioral outcomes related to condom use, sex partners, or sex work (see, for example, Archibald et al. [26]; Hart et al. [27]; and Latta [28]).
Risky sexual behaviors of IDUs are becoming an increasing source of the spread of HIV/AIDS beyond the IDU population [29]. Thus, providing clean needles is not enough to stem the tide of the epidemic in this dual transmission risk population. Rather, a more all-encompassing approach, including an emphasis on diminishing risky sexual behaviors, needs to be implemented, and its outcomes measured. Always using clean needles, while simultaneously engaging in unprotected sex, places IDUs, and the wider population associated with those IDUs, at disproportionate risk for acquiring and spreading HIV/AIDS.

Lack of information on HIV/AIDS knowledge among IDUs
Most of the risk behavior models and/or theories suggest (see, for example, AIDS Risk Reduction Model [30]; or Theory of Reasoned Action [31]) that HIV/AIDS knowledge must be present and related risky behaviors "labeled" as such before an IDU will consider changing his/her behavioral intent, and, ultimately, his/her behavior. To omit or minimize the study of HIV risk knowledge with NEP attenders, whether intentionally or not, is to reduce the probability that key variables can be appropriately structured and included in theoretic models of the future. Providing appropriate, factual, and relevant information to NEP clients appears to be an implied goal of most NEPs, although data on relative success in transmitting such information to the IDU population are woefully inadequate. Only three of the 64 studies examined in detail for the Ksobiech [14] meta-analysis provided any change/comparison data on HIV disease knowledge outcome measures [15,32,33].

Lack of clearly articulated desirable NEP outcomes
In reviewing the outcome measures of NEP researchers, it was not always clear whether a given result related to NEP attendance was desirable or not. Consider, for example, the dependent variable, "uses bleach," presumably as a means of cleaning needles and/or syringes. If bleach use by NEP participants declines over time, is that a negative or positive outcome?
One reasonable interpretation for a decline in "using bleach" is that NEP attenders are more likely to use sterile needles, lessening their need for bleach. If that is one's view, then a decline in "using bleach" is, in fact, a positive outcome. On the other hand, if one interprets "using bleach" as part and parcel of desirable IDU behaviors, its decline in NEP clients (or less frequent use, when compared to non-NEP clients) would necessarily be interpreted as a negative effect.
This ambiguity in NEP outcomes is especially prevalent in studies measuring changes in drug paraphernalia sharing behaviors. Beyond "using bleach," other examples include "boiling water" and "always cleaning the needle before use." Again, it could certainly be argued that a decline in such behaviors is related to use of sterile needles, rather than reusing one's own or others' needles in a risky manner.
In summary, because there are obvious differences in the resources available to any given NEP, there will be corresponding differences in behavioral goals over time. Even so, there is a definite need for the NEP community to articulate common goals in clear, concise terms, so that data being collected evaluate progress toward these goals, and can be compared, not only within the United States, but also internationally.

Coordination of research studies
NEPs vary considerably in size and scope, not to mention community acceptance and legality. Nearly all the research discussed here was limited to one or several NEPs in a specific geographic region. Further, most of the studies appear to be trying to answer this general research question: Is a given NEP "successful"? That question leads to data collection procedures that are primarily focused on information presumably deemed essential to demonstrate NEP effectiveness to funding sources and/or governmental units.
To date, there has been little effort to link studies across cities, states, and beyond in a manner that would maximize comparability. It would be helpful, for example, if 10 NEPs, located in major urban areas of the United States, cooperated in implementing a series of multi-site, longitudinal studies, utilizing the same dependent variables, measured via the same operational definitions, and then statistically analyzed individually and cumulatively.

Replication of studies
NEP research has thus far not been geared toward replicating prior studies or utilizing the measurement instruments of others. While the NEP studies analyzed do attempt to measure similar outcomes, the bulk of the studies appear to be designed in isolation from each other and, in many cases, almost appear to be purposefully different from one another. There is an overall need for replication of NEP studies by location, particularly those that found uncharacteristically large desirable [34] or undesirable [22] effects.
At present, it's not possible to say with certainty whether some studies' results are "outliers" due to poor methodology or an aberrant sample, or whether, in fact, NEP attenders in that particular location behave differently than NEP participants elsewhere in the world. These questions, and others, point to the need for rigorous replication of previously published studies.

Use/effectiveness of IDU-related communication messages
Scant reference, if any, was made to additional literature provided to NEP attendees at exchange sites. Presumably, the intent of such literature would be knowledge-oriented. The impact of such literature on NEP attenders is another apparently unexplored area. Are print materials provided being read, or are they merely discarded? How could such materials be designed for enhanced readability and/or impact? Research related to message design of supplemental NEP materials is indicated.
Beyond what is handed out at NEP sites, there are significant questions about the design and effectiveness of broadcast public service announcements (PSAs) in the areas of knowledge and behavior. Given that injection drug use is an illegal activity, it is unlikely that there will be PSAs about safe injection behaviors broadcast in the mainstream media anytime soon. Radio PSAs may be more effective in targeting this "hard to reach" audience. Groundbreaking work needs to be done in this area.
Further, at NEP sites, there may be an opportunity to incorporate risk-reduction messages through looped, brief, videotaped programs. Creating meaningful messages for this target population is an area to explore. Given that NEP attenders are typically less educated, designing and implementing an alternative to print materials may result in greater impact.

Role of NEP workers/volunteers
There is some necessary interaction between NEP client and NEP staff in virtually all circumstances, regardless of whether the NEP is fixed vs. roving, or legal vs. illegal, unless needles are distributed via a syringe dispenser. This face-to-face interaction could well be, at the least, a "teachable moment." Assuming that one goal of NEPs is building and maintaining a trusting relationship over time between NEP clients and workers/volunteers, to encourage IDU behavioral modification and ultimately drug treatment, research needs to be done to examine this potentially critical/key relationship. Often former drug users themselves, NEP staff already have a common ground with NEP clients. It is possible that this relationship can be enhanced via training NEP staff in persuasion compliance strategies.

Differences in types of NEPs
Little comparative information was found on the relative effectiveness of various types of NEPs (i.e., fixed sites, mobile vans, outreach workers). Many studies examined NEPs that provide needles at both fixed and roving sites, but the results were usually combined. It's possible, as suggested by Guydish et al. [35], that roving NEPs attract a different IDU population than fixed sites. Data need to be gathered to explore that possibility. Such an effort might lead to employing different communication, education, and long-term treatment strategies for fixed-site clients vs. those frequenting mobile/roving units.

Conclusions
(1) We should be interested in improving NEPs, not merely justifying any given NEP's existence by reporting on basics such as needles distributed/returned. Better-stated NEP goals, as previously discussed, will assist in moving NEP outcome evaluations in this direction.
(2) We need to improve coordination and communication within the NEP research community. Indeed, it would appear as though we researchers go out of our way to be certain that a given study's data are not similar to anyone else's, making comparisons across studies difficult at best.
(3) We need to broaden our perspective on NEP evaluative research. More research is needed in areas such as risky sexual behaviors and even the most fundamental of all: HIV/AIDS knowledge.
(4) We need a category system/typology, within which discernibly different dependent variables are considered to be equivalent.
(5) For all the studies, all the effort, all the publications, we know surprisingly little about relationships between and among the multitude of variables related to HIV/AIDS prevention. As we talk with one another, use the same dependent variables, defined the same way, and measure them with valid/reliable instruments, that scenario should improve. Until then, we'll just be "going through the motions," pretending that we're moving forward in this critical area.