During the 1999-2000 school year, the Preston County Education Association (PCEA) conducted a "Morale Survey" of its membership. One of the questions on that survey was as follows:
"If you believe that you are suffering from low morale now, which of the following have been contributing factors?"
The respondent was given a series of statements with which to either agree or disagree. The second most commonly identified factor contributing to low morale was "Extreme emphasis on Stanford Test results making the test more important than the students."
In an attempt to clarify the issue, the Preston County Board of Education asked the PCEA to perform a follow-up survey of its membership. This follow-up survey asked five questions, and the respondent composed his/her own reply:
1) Give specific examples of how the "extreme emphasis" has translated into extra and unreasonable demands placed on you outside the classroom.
2) What unreasonable tasks are you required to do in the classroom which you otherwise would not do as a result of this "extreme emphasis?"
3) What unreasonable modifications have you made to your lesson plans in order to comply with this extreme emphasis?
4) If the reasons for low morale related to Stanford tests are not explained by the above three broad areas of concern, then what is it about these tests which has you so upset?
5) If you object to standardized tests in general, what other instrument should the BOE, parents, and taxpayers use to gauge the performance of its schools?
Seventy survey documents were returned from this follow-up survey, and they serve as the basis for this commentary. This commentary addresses a specific subset of concerns: those dealing with West Virginia Board of Education Instructional Goals and Objectives (IGOs) and the use of the Stanford 9 examination (SAT 9) as the basis for assessing mastery of those objectives.
The most frequently cited objection with regard to "extreme emphasis on the Stanford Test" is that teachers have been directed to teach from specific Instructional Goals and Objectives (IGOs) rather than the teacher's own curriculum goals (31 comments). Examples of this sentiment, often expressed in terms such as "teaching the test -- the test is the curriculum," are as follows:
2-24 The curricular emphasis is based on IGOs (i.e., only teach IGOs that are on the test - teach the other stuff after testing.
3-11 I am encouraged to teach tested IGO's. They are the priority - not student needs.
3-31 All curriculum has to be taught around the IGOs. There is not enough time to teach a child to be well rounded. My objective are strictly IGOs.
4-43 The amount of emphasis placed on teaching IGOs make the program "splintered." We no longer have a complete program but one designed to only teach weighted test IGO. This seems to lead us to only one alternative.
In reality it is the IGOs and the emphasis on basing curriculum on them - rather than the test itself - which seem to be the major cause of resentment related to the Stanford Test.
The IGOs - West Virginia Board of Education Policy 2520, Instructional Goals and Objectives for West Virginia Schools - define the instructional goals and objectives for the programs of study and establish a standardized format for such in West Virginia public schools. The effective date of the current policy is given as July 1, 1997; the most recent revision was effective February 27, 1998. The IGOs are the curriculum standards which educators are asked to meet. For example, the Mathematics IGOs for Grade 8 consist of 60 separate objectives, 32 of which have been designated (in bold face) as the emphasized or tested IGOs. Three of the IGOs are listed here as further examples. The numbers in parentheses indicate the grade levels in which these particular IGOs are to be addressed.
8.11 (5,6,7) add, subtract, multiply, or divide fractions, mixed numbers, and integers resulting from problem situations using mental math, paper/pencil, and calculators
8.12 develop computational strategies based on the commutative, associative, and identity properties with emphasis on the inverse and distributive properties
8.13 (9,10,11) solve traditional and non-routine problems, which may include missing information, using appropriate tools
All the IGOs for all the grade levels may be found at the West Virginia Department of Education (WVDE) website at http://wvde.state.wv.us
I. Why do some teachers object to teaching the IGOs?
Either ...
1) The teacher objects to externally imposed standards of instruction in the classroom, taking the position that the teacher should be the sole arbiter of the subject matter that will be covered in his/her classroom.
Or...
2) The teacher is willing to accept that there must be standards concerning the curriculum, but believes that this particular set of IGOs is flawed.
I-1) Should the teacher be the sole arbiter of classroom content?
Some teachers express the opinion that the teacher should use his/her individual judgement to determine what should be taught in the classroom - that rather than conforming to a set of externally imposed standards, the teacher should set his/her own educational objectives. For example:
1-16 My own curriculum goals cannot receive adequate emphasis when I must align instruction with Stanford assessment IGO's. Students who learn just the tested IGO's receive an inadequate education.
1-33 I feel like (the central office) is telling me to teach the test instead of what my students really need. As a teacher, isn't that my experience or responsibility? I don't think that counts for anything any more!
4-9 "Extreme emphasis" on the SAT scores has taken from me the freedom to use my professional judgements about appropriate curriculum. Decisions about what learners should know are being made by the authors of this test instrument. Someone has decided what the individuals I work with, observe, assess, and diagnose should know and at what time.
My mission as a teacher is to foster intellectual growth and development and to create life-long learners. I need freedom to do this. I expect to use my judgement.
It is not that this test foists upon us a curriculum that is unacceptable, but it does decide for me what is most important. There are value judgements with which I may or may not agree. I think teachers should be able to teach to the "teachable moment" we encounter. We need to be free to take curriculum side trips when the intellectual curiosity of our students dictates that the time is right. We need to be free to take more time in some areas when a group appears to need it.
The imposition of academic standards is a national as well as a local issue. Ravitch notes that the decade of the 1970s and into the 1980s ...
Ravitch goes on to point out that...
There are any number of reasons to oppose standards, and over the past 15 years, all of them have been expressed. We hear from the left that standards are dangerous because they interfere with the absolute freedom of the teacher to teach whatever he or she wants; or that they pose the danger of "official knowledge," or that the emphasis on subject matter impairs schools that prefer to meet the emotional needs of their students. On and on. From the right comes the potent opposition of those who fear that standards will allow the federal or state government to impose propaganda or squishy, values-laden "outcomes-based education" on every classroom (Ravitch, 1996).
No further attempt to justify the need for academic standards in our public schools will be made here, except to agree with Ravitch that "The actual practice of setting standards is now recognized by virtually everyone as a function legitimately lodged with the states" (Ravitch, 1996).
I-2) Are the current IGOs flawed?
Many Preston County educators acknowledge that standards are needed, but object to this particular set of standards - the West Virginia State Board of Education Instructional Goals and Objectives, the IGOs - as the standard to which we should align our curriculum.
The "flawed IGO" objection generally takes one of several forms.
a) The IGOs were specifically written to match the content of the Stanford 9 test. Rather than defining a good and efficient curriculum for a given grade level, the IGOs were designed to match areas tested on the Stanford 9 - "A faulty curricula based on a 40 question test."
4-54 We have been told to stress the concepts that have the most tested items. Many times these are not the concepts that are the most relevant for our students. I feel pressured to cover all the IGOs that are tested, and feel that we teach to the test rather than to mastery of necessary concepts.
4-34 ... Furthermore, many important subjects/topics are not tested and therefore, it is not to be taught. Huge gaps are being created in our students' education.
4-29 The reading test is way above first grade reading level, for the majority of first grade readers. Language test is also.
2-2 Teaching Kindergarten students how to fill in small circles. Teaching skills not developmentally appropriate for 5/6 year olds.
3-1 Implementing IGOs even if students are not at a level to understand the concepts.
a) Were the IGOs specifically and narrowly written to lead to success on the Stanford 9 test?
b) Do the IGOs outlined in West Virginia State Board of Education Policy 2520 constitute a rational standard on which to base public education?
I-2-a) Were the IGOs specifically and narrowly designed to lead to high scores on the Stanford 9 test?
For example, the IGOs in Science provide specificity and detail to the West Virginia Science Curriculum Framework from 1992. The IGOs are derived from the National Science Education Standards and the American Association for the Advancement of Science Project 2061 Benchmarks. Each member of the science committee had copies of these materials. In addition, the IGOs were also correlated to the Stanford-9 Achievement Test, the National Assessment of Educational Progress, and the American College Test (ACT). The committee was comprised of over 30 K-16 exemplary educators with documented extensive participation in professional development activities that has enhanced their knowledge of trends in science education, current thought, and strategies for increasing student achievement in science. Each of the eight regions of the state were represented. (Personal communication, March 31, 2000)
4-48 Every meeting, most communications, textbook choices, etc. are centered on test scores.
The test runs the curriculum. It's as if the only thing that matters is doing well on the test.
This question echoes the old conundrum "Which came first - the chicken or the egg?" "Does the test drive the curriculum, or does the curriculum drive the test?"
3-27 ... Now I match my activities to IGO's instead of IGO's to activities.
How were the IGOs developed? Were they in fact taken from the latest editions of the Stanford
9 examinations? This question was posed to Mr. William Luff, Associate Superintendent, West
Virginia Department of Education. Mr. Luff notes...
The current IGOs are the most recent version of the state curriculum that was developed in response to the Recht decision in 1982. Originally called Learner Outcomes, the name was changed to a more accurate term, Instructional Goals and Objectives (IGOs) in 1989. In every instance, the curriculum has been developed by teams of exemplary classroom teachers, supervisors and professors from the state's colleges and universities. The process is a lengthy one, beginning two years prior to the adoption of instructional materials by the state, and is designed to ensure that the IGOs are reflective of the best scholarship and best practices in that curricular area. With the growth of the national standards movement, the state curriculum was further strengthened as more resources were available to the state teams.
In other words, the West Virginia State Board of Education IGOs were developed by leading
educators in our state.
In reality, the question "Whence cometh the IGOs" can be viewed as moot if we recognize that the IGOs should be judged on their own merit. Either they are good or they are bad. West Virginia State Board of Education Policy 2520 either constitutes a sound basis for our curriculum or it doesn't. If the IGOs are sound, then it makes no difference whether they were formulated after years of extensive study by national experts, drawn up by the faculty senate, or even, yes, taken from the current edition of the Stanford 9. The same observation may be made if they are judged to be unsound. Either they are good or bad - how they were derived is not the issue.
I-2-b) Do the IGOs outlined in West Virginia State Board of Education Policy 2520 constitute a rational standard on which to base public education?
Standards such as the West Virginia State Board of Education IGOs are of necessity formulated by committee. It is to be expected that individuals may agree with some of the standards and disagree with others. Cizek has noted that...
"In the end, all standard-setting is judgmental, requiring consensus about what content is
worthy of pursuit, as well as decisions about what level or levels of performance should be
expected. These expectations, in turn, are influenced by notions of where students currently
stand, aspirations for future levels of performance, cognitive and developmental constraints, and
information from other relevant sources, such as international comparisons" (Cizek, 1998).
That said, is there any peer review data which might shed light on questions regarding the
suitability of the West Virginia State Board of Education IGOs?
Education standards in all 50 states have recently been reviewed by the American Federation of Teachers (AFT), which published its findings in the report "Making Standards Matter 1999." The authors of this report note that...
1. Standards must define in every grade, or for selected clusters of grades, the common content and skills students should learn in each of the core subjects.
2. Standards must be detailed, explicit, and firmly rooted in the content of the subject area to lead to a common core curriculum.
3. For each of the four core curriculum areas, particular content must be present.
4. Standards must provide attention to both content and skills.
Their question, "Are the standards clear, specific, and grounded in content," was answered either yes or no for each state for each of the 4 core subject areas for each of the 3 levels - elementary, middle, and high school. This resulted in 12 separate determinations for each state. For a state to be judged as having quality standards overall, at least nine of the 12 determinations must have been judged to be clear and specific and include the necessary content. (American Federation of Teachers, B, 2000).
West Virginia's IGOs earned a quality rating from this group, adjudged as having met criteria in 10 of 12 determinations. The two areas which were judged as not having met criteria were elementary and high school social studies, with the reviewers noting "vague U.S. and world history standards." (American Federation of Teachers, C, 2000). Of interest, one of the conclusions drawn from the entire study is that "most states have more difficulty setting clear and specific standards in English and social studies than in math and science," and that "social studies standards are particularly weak across the states; these standards tend to lack specific references to U.S. and/or world history. Only six states have social studies standards that are clear, specific, and grounded in content across all three levels of schooling." The authors go on to speculate that "the overall weakness of the social studies and English standards may be due to the controversy surrounding efforts to develop national standards in these subjects by the subject-area professional associations." (American Federation of Teachers, D, 2000). Of final note, only 2 states earned a "perfect 12" from this group - Arizona and California. Six states scored 11 and West Virginia was one of 7 states scoring a 10. One might conclude that West Virginia's IGOs, as a document describing academic standards in schools, were ranked among the top 15 in the United States. (American Federation of Teachers, E, 2000).
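The AFT decision rule described above (4 core subjects, 3 school levels, and a threshold of nine determinations) can be written out as a short check. This is only an illustrative sketch: the function and variable names are hypothetical and do not come from the AFT report, and only the counts quoted in the text are used.

```python
# Illustrative sketch of the AFT "quality standards" rule as described in the text.
# All names here are hypothetical; only the numbers come from the commentary.

CORE_SUBJECTS = 4        # core subject areas reviewed for each state
SCHOOL_LEVELS = 3        # elementary, middle, and high school
QUALITY_THRESHOLD = 9    # determinations that must be met for a quality rating


def has_quality_standards(determinations_met: int) -> bool:
    """Return True if a state meets criteria in at least nine of 12 determinations."""
    total = CORE_SUBJECTS * SCHOOL_LEVELS  # 12 determinations per state
    if not 0 <= determinations_met <= total:
        raise ValueError(f"determinations_met must be between 0 and {total}")
    return determinations_met >= QUALITY_THRESHOLD


# Score tallies quoted in the text: 2 states at 12, 6 states at 11, 7 states at 10.
states_by_score = {12: 2, 11: 6, 10: 7}
states_at_or_above_ten = sum(states_by_score.values())

print(has_quality_standards(10))   # West Virginia met criteria in 10 of 12
print(states_at_or_above_ten)      # basis for the "top 15" reading
```

Because seven states share a score of 10, "among the top 15" is best read as a tie for the last of those places rather than a precise rank.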
In another independent analysis of state standards - "Quality Counts 99" - the AFT's assessments of state standards were expanded upon by considering each state's assessment and accountability process. West Virginia's public education system was awarded an "A-" in the area of "academic standards, assessments, and accountability," and was ranked 6th in the nation in this regard. (Education Week, 1999).
Finally, the quality of states' academic standards has also been evaluated in studies sponsored by The Thomas B. Fordham Foundation and released as "The State of State Standards 2000." (Finn, 2000). This group evaluated the academic standards of the various states in five "core subject" areas of English, History, Geography, Mathematics, and Science according to published criteria. This group's assessments are summarized in the following table.
| STATE | ENGLISH | HISTORY | GEOGRAPHY | MATH | SCIENCE | CUM. GPA | GRADE | RANK |
| WV | B | C | B | B | F | 2.2 | C+ | 14 |
| USA | C- | D+ | C- | C | C | 1.72 | C- | . |
| RANK | . | 7th among states. Among top 8 in nation | 6th among states. | 6th among states | 33rd among states. 27 states received a D or better | . | . | . |
As the data above indicate, West Virginia's IGOs were rated among the top ten in the areas of history, geography, and mathematics; well above average in English (no rank given in this particular section); and unsatisfactory in science. Overall, West Virginia's IGOs ranked 14th among the states, above the standards used in most other states.
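The cumulative GPA shown for West Virginia can be reproduced from the five letter grades if one assumes a conventional 4-point scale (A=4, B=3, C=2, D=1, F=0). That scale is an assumption made here for illustration; the text does not state how the Fordham report converts letters to points, and the plus and minus grades in the national row are left out for that reason.

```python
# Worked check of the West Virginia GPA in the table above.
# ASSUMPTION: a plain 4-point scale; the report's own conversion is not given in the text.
GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def cumulative_gpa(grades):
    """Average the grade points of a collection of plain letter grades."""
    grades = list(grades)
    return sum(GRADE_POINTS[g] for g in grades) / len(grades)


# Letter grades for West Virginia as listed in the table.
wv_grades = {
    "English": "B",
    "History": "C",
    "Geography": "B",
    "Math": "B",
    "Science": "F",
}

wv_gpa = cumulative_gpa(wv_grades.values())
print(round(wv_gpa, 2))  # (3 + 2 + 3 + 3 + 0) / 5
```

Under this assumption the result agrees with the 2.2 reported in the table, and it shows how heavily the single F in science weighs on the overall figure.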
These studies and assessments represent the judgements of well-credentialed independent educators from across the nation. It is stipulated that these assessments represent their opinions, and opinions are always subject to question. For example, it should be noted that the science curriculum assessment offered by the Fordham report is credited to only one person (assessments in other areas were performed by a panel of experts) and thus could be shaded by one person's singular preferences. (For example, the science IGOs were soundly criticized, among other reasons, because of the "bolding" of selected IGOs implying that other objectives were less important. Interestingly, other evaluators in other subjects praised the identification of emphasized IGOs in that it provides clear and concise guidance to the teacher). Furthermore, it may be unclear if the various "evaluators" were basing their assessments on the latest documents.
It is stipulated that any such set of Instructional Goals from any state should go through a continual process of review, refinement, and improvement. It may be noted that the current West Virginia State Board of Education IGOs do go through a continual process of review and refinement, and even at the time of this writing the K-3 English Language Arts IGOs are under review for proposed revision. Nevertheless, it is difficult to accept the proposition that the West Virginia State Board of Education IGOs are without merit - as many would have us believe. The available evidence indicates that West Virginia's IGOs were formulated by leading educators in our state and have been the subject of detailed analysis by other independent experts. The West Virginia IGOs have been found to be at least suitable if not among the best in the nation for use as a set of standards on which to base our public education.
II. Is the emphasis placed on standardized testing in Preston County schools inappropriate?
In order to address this question, one must ask two questions:
1) What is the purpose of standardized testing in West Virginia public schools?
2) Are the stated purposes for administering the test valid?
II-1) What is the purpose of standardized testing in West Virginia public schools?
On the surface, the reasons for giving tests to school students would seem to be self-evident. Most would agree that testing is required to determine if a student has mastered the subject matter and to assign a grade to the student's performance. Cizek has noted that standardized achievement tests are given in America for a variety of reasons, running the gamut from a simple classroom test to large scale achievement tests such as the National Assessment of Educational Progress (NAEP) designed to assess the country's overall educational health. "Near the middle of the continuum are state-level competency tests used by many states as 'gatekeepers' for grade-to-grade promotion or graduation" and/or used (or mandated) by parents and policy makers "as markers in efforts to improve the education of American children" (Cizek, 1998).
Why are standardized tests administered in West Virginia public schools? The following tables summarize the implications of standardized test results in West Virginia. This data was gleaned from the West Virginia Board of Education Training Manual and Handbook for Education Performance Audits (Accreditation Manual), West Virginia Board of Education, Policy 2510, and Chapter 18A-3A-2B of the West Virginia State Code.
Implications of Standardized Test Scores in West Virginia Public Schools
| . | Identify seriously impaired schools | Identify students for skill improvement efforts | Graduation warranty (basic skills) | Graduation warranty (advanced) |
| Test score | The total basic skills score for one or more grade levels in grades 3 through 11 is at or below the 30th percentile in the most recent year for which data are available and one of the two preceding years. | The student scores below the 50th percentile in the areas of reading, mathematics, and/or language arts at grade 8 or above. | The student scores at the 50th percentile or greater at grade 11 in the areas of reading, mathematics, and language. | The student scores at the 70th percentile or greater at grade 11 in the areas of reading, mathematics, and language. |
| Results in... | The school shall be considered "seriously impaired" | Student is placed in a skills improvement program. | Upon graduation, the student is issued a "warranty" indicating competency in basic skills. | Upon graduation, the student is issued a "warranty" indicating competency for advanced work place positions and entry into post-secondary education. |
| Immediate effect | The West Virginia Board of Education shall appoint a team of improvement consultants to make recommendations within sixty days of appointment for correcting the impairment. Principal required to attend the next Principals Academy.* | Written curriculum must be designed to implement the skills improvement program which must concentrate on improving deficiencies. | The warranty indicates that the graduate has mastered the basic skills of reading, mathematics, and language at a level appropriate for an entry level position in the workplace | The warranty indicates that the graduate has mastered the basic skills of reading, mathematics, and language at a level appropriate for advanced work place positions and entry into post-secondary education. |
| Possible ultimate effect | If progress is not made in correcting impairments, the State Board of Education may ultimately "intervene in the operation of the school system to cause improvements to be made" which may include a variety of personnel actions up to and including "declaring that the office of the county superintendent is vacant." | Additional time and resources spent by teachers and school administrators to analyze student areas of weakness and design appropriate curriculum for reteaching. Student will be given opportunity to bring basic skills scores to the 50th percentile. | If the student does not function successfully, the graduating school system will provide additional instruction in the basic skills at no cost to the student, employer, or post secondary institution. Warranty in effect for five years. | If the student does not function successfully, the graduating school system will provide additional instruction in the basic skills at no cost to the student, employer, or post secondary institution. Warranty in effect for five years. |
| Reference | WV BOE Accreditation Manual: 8.1, 8.5.1, 8.5.2, 8.5.3, 10.6.1, 10.6.2, 10.6.3 | WV BOE Accreditation Manual: 5.6.22 WV BOE Policy 2510: 8.2.8 | WV BOE Accreditation Manual: 5.6.23, WV BOE Policy 2510: 5.49, 8.2.7, | WV BOE Accreditation Manual: 5.6.23, WV BOE Policy 2510: 5.49, 8.2.9, |
| . | A factor in school accreditation | A stated objective | A stated objective |
| Test score | A minimum of 50% of the school's students in grades 3 through 11 perform at or above the 3rd quartile in total basic skills; and no more than 15% of the students perform within the 1st quartile; or the percentage of students performing within the 1st quartile is decreased based on two of the most recent three years | The percentage of graduates attaining the 50th percentile in reading, mathematics, and language is at or above 60%. | The percentage of graduates attaining the 70th percentile in reading, mathematics, and language is at or above 33%. |
| Results in... | The school should address the area(s) in the Unified School Improvement Plan or equivalent strategic plan. | No particular consequences apparent at this time. | No particular consequences apparent at this time. |
| Immediate effect | Written curriculum must be designed to implement the skills improvement program which must concentrate on improving deficiencies. Principal may be required to attend the next Principals Academy. * | Applies to students entering the 9th grade in the fall of 1998. | Applies to students entering the 9th grade in the fall of 1998. |
| Possible ultimate effect | Additional time and resources spent by teachers and school administrators to analyze student areas of weakness and design appropriate curriculum for reteaching. Could lead (along with other factors) to less than full accreditation for the school if not corrected; could lead to "seriously impaired status." | . | . |
| Reference | WV BOE Accreditation Manual: 4.1 | WV BOE Accreditation Manual: 4.12 | WV BOE Accreditation Manual: 4.12 |
* Principals are required to attend the Principals Academy every four years regardless of standardized test performance.
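The percentile thresholds in the tables above can be restated as simple decision rules. The sketch below is a reading of the table only, not the text of Policy 2510 or the Accreditation Manual. The function names are hypothetical, and two interpretations are assumptions: that a warranty requires the threshold to be met in all three areas, and that a student who qualifies for the advanced warranty is reported under that warranty alone.

```python
from typing import List, Tuple


def student_outcomes(grade: int, reading: float, mathematics: float, language: float) -> List[str]:
    """Apply the student-level percentile thresholds from the table (hypothetical helper)."""
    scores = (reading, mathematics, language)
    outcomes = []
    # Below the 50th percentile in reading, mathematics, and/or language at grade 8 or above.
    if grade >= 8 and any(score < 50 for score in scores):
        outcomes.append("skills improvement program")
    # Warranties are tied to grade 11 scores.
    # ASSUMPTION: the threshold must be met in all three areas.
    if grade == 11:
        if all(score >= 70 for score in scores):
            outcomes.append("advanced warranty")      # ASSUMPTION: reported alone
        elif all(score >= 50 for score in scores):
            outcomes.append("basic skills warranty")
    return outcomes


def seriously_impaired(most_recent: float, two_preceding: Tuple[float, float]) -> bool:
    """One grade level's total basic skills percentile: at or below the 30th in the
    most recent year and in at least one of the two preceding years."""
    return most_recent <= 30 and any(score <= 30 for score in two_preceding)


print(student_outcomes(11, 55, 65, 50))
print(seriously_impaired(28, (35, 30)))
```

Note that the school-level rule requires a low score in two of three years, so a single poor testing year for one grade level would not by itself trigger the "seriously impaired" designation under this reading.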
It may be concluded that in West Virginia public schools, standardized test results are used ...
a) To serve as one factor in the overall accreditation of a school (and school system).
b) To identify seriously impaired schools (and school systems).
c) To assess a student's mastery of the IGOs.
d) To identify students with academic weaknesses in basic skills - reading, mathematics, and language arts - and to provide specific information to teachers concerning those areas in which the student needs additional instruction.
e) To identify students who will be placed in a skills improvement program.
f) To serve as the basis for the school system to certify its graduates as proficient in basic skills at two levels - the graduation "warranty."
Of note, West Virginia does not require a threshold performance on a standardized test for a student to advance a grade level or graduate, as do some states (Texas and North Carolina, for example). Additionally, there is no indication that standardized test results are used in the individual classroom teacher's evaluation. West Virginia Board of Education policy 5310 - Performance Evaluation of School Personnel - makes no mention of standardized test results as figuring into the teacher's performance evaluation (in fact, the word "test" does not appear in the policy).
II-2) Are the stated purposes for administering a standardized test valid?
One might suppose that there would be little debate concerning the validity of at least some of the issues which standardized testing in West Virginia public schools is intended to address. However, such is not the case. Some teachers object to the use of standardized test scores to draw any conclusions about the school's performance. For example...
5-6 ...We should not judge our teachers by a group of students who work to do poorly on a test because they don't like the system. This makes teachers feel used and abused unfairly. We have enough hassle in the class with new requirements nearly everyday. Forgive us if we don't shine on a test we did not take and move on.
5-21 A school made - teacher made test. Collaborated by grade level teachers.
5-32 Why would the SAT 9 gauge the performance of the schools. It is the students'
performance it is meant to evaluate. Conferences, daily work and report cards, class pretests,
and post tests are valuable tools of performance as the student is compared to their own ability
and not a school's ability.
3-2 Being pressured/required to document the teach/reteach stages of IGO's is unreasonable.
4-30 I am resentful of having to be responsible for a lower quartile's performance when it's a
natural law that there will always be a lower quartile. Also, no steps are taken to make sure
students do their best so as not to mis-represent what they have been offered. Supposedly, they
will be in remedial classes next year if their scores are low, thereby, one more time, saving the
lower quartile. I think the schools should realize that there are those people who cannot be
helped.
4-24 ...I can not get my required IGO's taught if I must go back over basic skills in a senior/junior class which should have been learned somewhere else - years earlier.
5-19 ... I do not believe that Stanford test or any test taken by students can or should be used to assess school or teacher performance. A test such as that can only be used to measure student performance, not teacher performance....I am not on trial when my students take the Stanford test, but that is how I am made to feel. The message coming down from the top is quite clear to most of us teachers: You will be judged by how well your students do on the Stanford test. That is blatantly unfair and unreasonable. My success as a teacher is not dependent upon that or any other standardized test. If you are looking for a tool that will assess how children have retained and can apply what they have been taught, then use a test. But if you want to assess how well teachers teach, then find another tool..... I resent the accusations that if students don't do well on the Stanford test, then it must mean the teachers didn't teach well enough... We do all that we can for our students, but ultimately, their performance on a test is out of our hands.... I have to say that I never feel supported or appreciated by the board or central office administrators, and the times that I feel the least appreciated and supported are the times when the Stanford test is brought up.
Some may object to the very concept of attempting to "re-teach" students with documented
deficiencies in basic skills. For example...
2-28 Working for an hour on the last year's test weakness - their attention span is too short for
this and this takes major time from everyday academics.
These objections aside, it is difficult to justify the position that the individual school, the county school system, or the West Virginia public education system should be free from any attempt to evaluate effectiveness. Since schools exist to educate students, it must follow that schools should be judged at least in part, if not exclusively, on how well they educate students. School performance must be tied to student performance.
The degree to which schools and teachers should focus on re-teaching students who are academically deficient may be debatable. What is "re-teaching" if it is not "teaching a student something that he/she doesn't know or hasn't mastered?" If we object to teaching a student something he/she doesn't know, then the whole idea of education would seem pointless. And how can we assess what a student does or doesn't know without some objective test or measurement?
III. If the issues addressed by standardized testing are valid, is the SAT 9 an appropriate tool with which to address those issues?
If we may stipulate that it is necessary to assess school and individual student performance, and that some mechanism should be in place to provide objective identification of deficiencies, and that there should be an attempt to correct identified deficiencies, then the question presents: Is the Stanford 9 an appropriate tool with which to address those issues?
Several teachers have expressed the opinion that the SAT 9 in fact is not an appropriate tool with which to assess the mastery of IGOs by Preston County students. For example...
4-8 How can students in rural school settings where paper and pencils have become luxuries be compared to students in large city, upper socio-economic groups where educational experiences are unlimited?
5-29 Criterion referenced testing done at the beginning of the school year and the end. Looking at performance throughout the year - not just on one test. Some people do not test well and others are lucky guessers. The standardized tests seem to be biased against rural students.
5-7 I don't object to standardized tests, but I do object to comparing WV students to some in other states. We are "Appalachians." Some students aren't "well read" about some customs, dialects of other regions. Some things that are on the test haven't been covered or aren't until after the test is given. How is this fair to the students?
5-26 ...I don't completely object to standardized tests. However, Stanford is in California. Californians don't know an awful lot about West Virginians. Isn't there a WV achievement test? Hasn't anyone been able to develop one through Marshall University, West Virginia University - or both? That might give a more accurate picture.
4-2 Giving a test in which students are destined to fail because of the level of test - far above the majority of the students' levels.
4-28 ... In addition, in my curricular field the IGO's are written in such a broad vague way that it would be impossible in many areas to insure I was teaching the specific material tested. Having seen the test on several grade levels many questions really fall outside of the curricular alignment designated by the state. Therefore, we are often forced to find ways to bring in extraneous unrelated data just to cover the IGO created only because the item is on the test.
4-36 The test is not designed to show mastery. It is designed to force students into a bell curve. The test is designed for failure.
4-20 That the emphasis is only on teachers with student and family lacking any accountability. It is a standardized test with a norm reference - you will always have some below the norm or they will either change the test or re-norm the results.
4-21 I believe there is always going to be an "average," "above average," and "below average" students. The test scoring dictates this.
4-42 Not all students should be expected to take the test under same conditions. If a students needs to be able to read his test orally to himself to be able to comprehend that should be allowed. I do not like the several isolated reading passages the students need to read at one setting. Students scoring below the 50% somehow need to be measured/recognized for progress made each year. The test cannot be used to reflect how I have met the students needs.
5- 4 I feel that in order to have valid test scores across the state (or nation, for that matter) a different test should be given every year. The first year the Stanford Achievement Test was given was probably the only year that gave an accurate analysis of student performance. Since then, the teachers whose scores have increased drastically are probably the same teachers who have done the best job of teaching the test. I personally would not feel comfortable if my students' average scores were in the 70th or 80th percentile. These results seem excessively high and unrealistic. In closing, if you really want to know if teachers are offering a solid, sound curriculum, then change the test yearly and do not give teachers the opportunity to teach the test!
5-11 I do not object to standardized testing as an instrument to aid in knowing weaknesses and strengths. I do object to 4-5 days of testing, not always under the best conditions, carrying so much weight in student performance.
4-16 It is not fair to compare schools scores when not all schools have the same programs and support.
4-18 We are constantly compared to other schools based on test scores. No consideration is given to student population.
4-23 There are numerous ways by which schools can be "graded," but in Preston Co., all teachers and students hear is Stanford, Stanford, Stanford.
4-26 I feel that there is a lot of pressure put on us for our students to perform well.
4-27 I feel pressure that if my students don't do well then I'm a bad teacher. Unless test questions are extremely easy or questions are being taught, the majority of our kids should be scoring around the 50th percentile. Standardized tests are meant to have most kids at 50th percentile.
4-35 We have been told discreetly that if test scores didn't improve, we would be "taken over" by state officials, our principal might lose his job, we would be required to do even more alignment. Threats and innuendos.
4-39 I object to the fact that only those teachers with high scores are recognized. Pamphlets were made praising these teachers. Is a teacher solely recognized as a good teacher based upon test scores?
4-44 ...At some future point, will teacher evaluations be based upon how well their students have performed? When the pressure to produce goes up, the temptation to win at whatever the cost also rises, which would have a multitude of negative consequences for our education system.
4-51 The very first statement on my evaluation for 98-99 was where my students scored on the SAT. One administrator said we would be held responsible for our students scores being raised.
4-53 The entire staff has been threatened with improvement plans.
4-2 ...we never receive credit/acknowledgment /praise for the current accomplishments. Emphasis is given exclusively to the lowest SAT-9 quartiles.
4-25 We are being compared, unfairly, to other schools/states etc. You are made to feel terrible if your students perform poorly. You may teach well but students have a bad test day or weather causes distress, etc...
1-26 Individual and group analysis of SAT-9 test. All tested IGO's had to be correlated to curriculum and outside resources. This process took untold hours, because I teach all subjects. I began working on this in August / finished in October.
2-28 Working for an hour on the last year's test weakness - their attention span is too short for this and this takes major time from everyday academics.
3-2 Being pressured/required to document the teach/reteach stages of IGO's is unreasonable.
1) The SAT 9 examination and the manner in which it is administered is technically flawed.
Examples given include...
III-1) Is the SAT 9 technically flawed?
Any test, whether it is a nationally normed "standardized test" or the weekly classroom math quiz, may contain unfair or ambiguous questions, may include items which are "too difficult," or may be administered under suboptimal conditions. Any test, whether devised locally or at the state level, whether it is criterion referenced or norm referenced (terms discussed later in this report), or whether it is select-response (e.g., multiple-choice, matching, true/false) or constructed-response (e.g., essay, short-answer, speech, project), will be subject to these same criticisms. These are valid concerns; however, it is doubtful that any test could be devised which would be free from such problems. Standardized tests such as the SAT 9, CTBS, etc., are reported to have been extensively analyzed for such errors, and the publishers claim that such errors are minimal. Whether a test could be devised at the local or state level which would be free of such errors is open to speculation. Were the SAT 9 scrapped in favor of another assessment tool, one might predict that these same criticisms would persist.
Problems with infrequent "norming" of the test and resultant score inflation ("Lake Wobegon effect") have been well described (Cizek, 1998). In addition, reports abound which suggest that many well known national "standardized tests" may be biased against a particular racial, gender, or socioeconomic group. The Texas Assessment of Academic Skills (TAAS), a criterion-referenced "high stakes" test ("high stakes" in that satisfactory performance on this exam is required to advance to the next grade level or graduate), has been the subject of two separate lawsuits based on racial discrimination (Phelps, 1999). The TAAS case is interesting in that it is a test locally developed in Texas and specifically written to assess Texas instructional goals. One may speculate that even if there were a locally developed "West Virginia test," such "discrimination" charges might still surface.
Choosing a "norm group" which is not representative of the tested group is a significant issue. An example of the difficulties which may arise in interpreting scores from norm referenced tests is illustrated by examining Preston County standardized scores from the past 22 years (Appendix One). Preston County scores may be compared with West Virginia's average county score for each year for grades 3, 6, 9, and 11. Analyzed in this fashion, the "reference group" for a given year is not that particular test edition's "national norm group," but rather the performance of students in other counties in West Virginia. This would seem to control for socioeconomic differences between the tested group and the reference group. This approach would also seem to control for infrequent or unexpected "re-norming" of the test. The change from the CTBS test to the SAT 9 which occurred in 1997 is controlled for, in that all counties made the change the same year, and analysis is based on performance compared to other counties in the state - rather than the specific test's "national reference group."
When Preston County's total basic skills scores are analyzed in this fashion, the following picture develops: Third grade basic skills scores have been below the state average every year since 1981, but have been over the 50th "national percentile" every year during the same interval (with the exception of the 1989-90 year when no testing was done). Similarly, sixth grade basic skills scores have been below the state average every year since 1979, but above the "50th national percentile" every year except for 1978. Eleventh grade basic skills scores have only reached the state average one year (1994) but have been above the 50th percentile nationally 7 out of the last 22 years, and 6 out of the last 7 years.
Analysis of test scores in this fashion illustrates several caveats in interpreting test scores. For instance, one might be heartened to find that Preston County's 3rd grade basic battery score in 1999 was at the 56th percentile compared to the "national norm reference group." However, the state average that year for the same group was the 63rd percentile. While Preston County scores could be said to be "above the national average," those same scores were well below the state average. The average scores for West Virginia counties at all grade levels have been above the 50th percentile compared to the "national norm reference group" every year since 1987: the "Lake Wobegon effect," where "all students are above average." Of further note, Preston County's third grade score in basic battery apparently "improved" from the 52nd percentile in 1997 to the 55th percentile in 1998. While this improvement is notable, equally notable is that the state average score for the same group improved from 58 to 62 in the same interval. One could say that Preston County's third grade scores improved three points in 1998, or one could say with equal authority that our third grade students fell behind their West Virginia peers by one point the same year.
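The two readings of the 1998 third grade result can be laid out as simple arithmetic. The sketch below uses only the four percentile figures quoted above; the helper name is hypothetical, and subtracting percentile ranks is itself an assumption made for illustration, since percentile ranks are not an equal-interval scale and the differences are only a rough guide.

```python
# Third grade basic battery percentile ranks quoted in the text.
preston = {1997: 52, 1998: 55}
state_avg = {1997: 58, 1998: 62}


def gap_to_state(year):
    """County percentile minus state-average percentile (hypothetical helper)."""
    return preston[year] - state_avg[year]


# Change measured against the national norm group.
county_change = preston[1998] - preston[1997]
# Change in the West Virginia average over the same interval.
state_change = state_avg[1998] - state_avg[1997]
# Change in the county's standing relative to its West Virginia peers.
gap_change = gap_to_state(1998) - gap_to_state(1997)

print(f"County change vs. national norm: {county_change:+d}")
print(f"State average change:            {state_change:+d}")
print(f"Change in gap to state average:  {gap_change:+d}")
```

The same score history yields a gain of three points against the national norm and a loss of one point against the state average, which is the interpretive caveat at issue.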
Comparing our student performance only to those in other counties in West Virginia gives only an incomplete picture, as will be discussed later in this report. Nevertheless, such technical objections to any given test regarding its format, difficulty, reliability, and validity will continue to be raised. Such objections may be easy to raise but difficult to remedy.
III-2) Is the SAT 9 properly keyed to the West Virginia State Board of Education IGOs?
If we were to take the assertions of many of our teachers at face value, such as...
3-22 All my lesson plans are correlated with tested IGO's. I do not teach any curriculum not tested on SAT- 9.
4-28 ...However, with the new IGO's it is evident that they were developed to teach primarily what appears on the test in a given year without regard to the systematic or linear progression of the material...
4-48 Every meeting, most communications, textbook choices, etc. are centered on test scores. The test runs the curriculum. It's as if the only thing that matters is doing well on the test.
then the question arises: "How can it be that an 'off-the-shelf' test such as the SAT 9 can prove to be keyed to the West Virginia State Board of Education IGOs?" This question was posed to Mr. William Luff, Associate Superintendent, West Virginia Department of Education. Mr. Luff notes:
While the current assessment program is not perfect, it has been affordable. Most importantly, its effectiveness in supporting instruction is confirmed by improved scores by students in WV on every independent measure of student achievement--the National Assessment of Educational Progress (NAEP), and college bound students' ACT scores--the highest in the history of the state despite the fact that approximately 60% of all seniors now take the test. (Personal communication, March 31, 2000).
Furthermore, the degree to which the assessment tool (the test) achieves critical importance is relative only to the implications which attend success or failure on the test. For example, Texas public school students must pass the Texas graduation test to receive a diploma. In this case the mechanics of the test, including how well it reflects the curriculum offered the student, achieve critical importance since graduation is directly tied to performance on the test. (No doubt explaining why such "high stakes" tests are often the subject of litigation). In West Virginia, the implications of success or failure on the SAT 9 do not approach this level of significance. For the most part, poor performance on this test only means that curriculum must be adjusted for the school and/or for the student. However, in that the curriculum is based on demonstrably valid IGOs, how can curriculum adjustments based on valid IGOs possibly be viewed as objectionable? The rhetorical question arises: "How is it a bad thing to teach a student something he/she doesn't know?"
The debate may continue concerning the degree to which the SAT 9 is accurately keyed to the West Virginia State Board of Education IGOs. However, most would agree that it is the Instructional Goals - the standards - which define our attempts to educate our children and which are of primary importance.
III-3) Are SAT 9 results an appropriate factor to consider in school accreditation and to identify seriously impaired schools?
One might concede that as far as standardized tests go the SAT 9 is in and of itself not a bad test. However, one might correctly raise objections to its use for specific purposes - "A good tool but used for the wrong purpose." The objections addressed above concerning tests in general - technical flaws and proper keying to the IGOs - may apply to any type of test. In order to further analyze the suitability of a test such as the SAT 9 for a given objective, it is necessary to briefly review the various types of standardized tests, note several definitions and distinctions among the tests, and examine purposes for which various types of tests might be used.
Norm-referenced tests (NRTs) are designed to describe relative rank among students at a particular grade level, providing information about how a student's performance compares with a reference group of students called the norm group. The Stanford 9 (like most "achievement tests") is an example of an NRT. For example, the national standardization sample (norm group) for the current edition of Stanford 9 is based on spring and fall 1995 testing, with between 500,000 and 600,000 students participating (Harcourt, Stanford 9 Technical Information, 2000). A student who performs at the 50th percentile on a norm-referenced test may be said to have performed as well as or better than 50% of the students in the norm group who took the test. Other examples of NRTs include the Comprehensive Test of Basic Skills (CTBS), the California Achievement Test, and Terra Nova, all published by CTB/McGraw-Hill; the Metropolitan Achievement Test, published by Harcourt-Brace Educational Measurement (as is the Stanford 9); and the Iowa Tests of Basic Skills, published by Riverside Publishing. "Together, these tests substantially define large-scale, norm-referenced achievement testing in the United States. Nearly 60% of the state-mandated achievement tests used across the country are commercially published, with the achievement tests of these three major publishers accounting for 43% of all system-wide tests" (Cizek, 1998).
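The percentile definition just given ("as well as or better than 50% of the students in the norm group") can be illustrated with a minimal sketch. The function name and the ten raw scores are hypothetical and are not Stanford 9 data; test publishers may use more refined conventions (for example, in the handling of tied scores) than the simple count assumed here.

```python
def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring at or below `score`.

    Assumption: the simple "as well as or better than" reading of a
    percentile rank; published tests may define it differently.
    """
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Hypothetical raw scores for a ten-student norm group.
norm_group = [12, 15, 18, 20, 22, 25, 27, 30, 33, 36]

print(percentile_rank(22, norm_group))  # 5 of 10 at or below: 50.0
print(percentile_rank(33, norm_group))  # 9 of 10 at or below: 90.0
```

Note that the result depends entirely on who is in the norm group: the same raw score would earn a different percentile rank against a stronger or weaker reference group.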
As an aside, a commonly heard observation concerning norm-referenced tests is that the nature of such tests dictates that there will be questions on the test which the test taker will not be expected to be able to answer; that, in order to allow for a wide dispersal of scores and a uniform "curve," there will be questions intentionally placed in such tests which are far above the expected skill or knowledge level of the test taker. Of note, Cizek's definition includes the proviso that norm-referenced tests "are constructed to cover content that is considered fairly universal at each grade level" (Cizek, 1998). The publishers of the Stanford 9 assert that "All items (in the current Stanford 9) are grade-level appropriate so that they are within the experience of students taking the test" (Harcourt, Stanford 9 Overview, 2000).
Criterion-referenced tests (CRTs) are intended to "gauge whether a student knows or can do specific things" (Cizek, 1998). They are based on content judged to be important in regards to the area being tested, and criteria for success are established in a judgmental fashion. An example of a criterion-referenced test would be the recertification exam given by the American Board of Surgery. The questions on this test are based on knowledge that a practicing surgeon is expected to have at his/her command. Success (recertification) or failure is based on how the individual scores on the test - 70% is pass, less than 70% is fail. Theoretically, everyone who takes the test could pass, or everyone could fail. An ordinary classroom test is an example of a criterion-referenced test. "High stakes" tests (see below) are generally criterion-referenced.
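The pass/fail logic of a criterion-referenced test can be sketched in a few lines. The 70% cut score is the one cited in the example above; the function name and both groups of scores are hypothetical, invented only to show that the outcome for one test taker does not depend on the others.

```python
PASS_MARK = 70.0  # percent correct; the cut score cited in the text


def passes(percent_correct, pass_mark=PASS_MARK):
    """Criterion-referenced decision: compare to a fixed cut score,
    not to the performance of other test takers."""
    return percent_correct >= pass_mark


# Hypothetical groups of test takers.
strong_group = [88.0, 91.5, 74.0, 70.0]
weak_group = [42.0, 55.5, 69.9, 61.0]

print([passes(p) for p in strong_group])  # every member passes
print([passes(p) for p in weak_group])    # every member fails
```

Because the cut score is fixed in advance, an entire group may pass or an entire group may fail, which is not possible when results are reported as ranks within a norm group.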
Standards-referenced tests (SRTs) are similar to criterion-referenced tests, with an additional attempt made to "link students' scores to concrete statements about what performance at the various levels means" (Cizek, 1998). Content standards are devised to represent "what the student should know" and performance standards are developed to describe "how well students need to be able to perform on a set of content standards in order to meet pre-defined specified levels of expected performance." An example of a standards-referenced test is the National Assessment of Educational Progress (NAEP), which reports students' performance as Basic, Proficient, and Advanced.
"High Stakes" test - A test in which significant consequences are associated with performance on the test. A test to determine if a student will graduate or pass to the next grade would be a "high stakes" test. A test which one must pass in order to obtain licensure or qualify for a specific job or occupation would be a "high stakes" test.
"Low Stakes" test - A test in which serious consequences do not follow from the performance on that single test. A weekly classroom quiz which is averaged with other tests to arrive at a particular grade in a course would be a "low stakes" test. The NAEP (National Assessment of Educational Progress) is described as a "low stakes" test for the individual student in that individual student scores are not even reported.
Comparisons of three types of "standardized tests" (from Cizek, 1998)
| | Norm-referenced tests (NRTs) | Criterion-referenced tests (CRTs) | Standards-referenced tests (SRTs) |
|---|---|---|---|
| Answers the question... | "Where does this student stand compared to others at his or her grade level?" | "Can the student demonstrate knowledge or skill to a specified level?" | "How would this student's performance be rated, according to pre-set standards?" |
| Scores are reported as... | Percentile rank | Generally, pass or fail | Terms such as "Beginning, Proficient, Expert," or "Good, Better, Best." A, B, C, D, F |
| Key point | A student's performance "does not necessarily indicate anything about the knowledge or skills a student has mastered, nor whether scoring at the reported percentile represents acceptable progress, nor whether instruction has been of sufficient quality, nor whether the content is sufficiently challenging or the outcomes measured desirable." | A student's performance "does not necessarily indicate anything about whether the student is better or worse than average, nor whether the criteria represent noteworthy expectations given the student's age or grade level, nor whether the content is challenging or the outcomes measured desirable." | A student's performance "(does) not necessarily indicate anything about whether the student is better or worse than average, nor whether the criteria represent noteworthy expectations given the student's age or grade level, nor whether the content standards associated with the performance are particularly challenging." |
| Caveats and unanswered questions | "Performing at grade level means only that a student is performing about as well as the average performance of the norm group; no evaluation is made regarding whether the norm group as a whole is performing superbly or terribly." "A student performing 'at grade level' on an NRT could be well-prepared for global competition or woefully lacking in even the most rudimentary areas." | Because the criteria (what the student is expected to know) are established in a subjective manner, results are linked to the expectations of those who establish the criteria and write the test. | "Because performance standards ... are established in a subjective manner, classifications such as 'Proficient' or 'Expert' are inextricably linked to the conceptions of competence held by those who establish them. If those who set the standards have high expectations for performance, a classification such as 'proficient' might mean magnificent accomplishment; if the standard-setters have low expectations, the same classification could represent mediocrity." |
The question at hand is, "Are SAT 9 results an appropriate factor to consider in school
accreditation and to identify seriously impaired schools?"
In theory, the answer might seem to be "no." A criterion-referenced test rather than a norm-referenced
test such as the SAT 9 might be advocated when testing is "high-stakes" (and for the
moment we shall stipulate that this accreditation process is "high-stakes"). All schools should
have the opportunity to be accredited based on their own merit and compliance with established
standards. There should be no stipulation that a certain percentage of schools should receive less
than full accreditation, and success should not depend on the relative performance of other
schools during a given assessment period. Under this "ideal" model, curriculum goals (IGOs) for
each grade level would be established and a test devised to assess student mastery of those
goals. Students would take the test and each student would either pass or fail based on his/her
own performance (percentage of questions answered correctly). Each student could pass, or
each student could fail. Average student scores for each grade level could be devised, and the
performance of each grade level and/or the school as a whole could be assessed and rated as
"satisfactory" or "unsatisfactory" based upon average student score and established performance
standards.
In reality, however, the SAT 9 does seem to be an appropriate test for this purpose. Perhaps in an ideal world, each school would set its own curriculum goals and devise its own assessments, as was suggested.
If the state standards match the local school standards, then one assessment tool would seem to suffice. However, in our ideal world where the school establishes its own standards, a second testing program would be required to assess the school's performance according to state standards. This second test should be a criterion-referenced test developed by the West Virginia Department of Education to assess the student (and school) against state expectations. Every school could theoretically be accredited or not accredited. Unfortunately, the same problems would exist with this state assessment as with the county assessment - the state criterion-referenced assessment test would not necessarily indicate anything about whether the school and state's performance is better or worse than average, nor whether the instructional goals established by the state represent noteworthy expectations given the student's age or grade level, nor whether the content is challenging or the outcomes measured desirable (Cizek, 1998).
This scenario finally plays out when the student each year would need to take yet a third standardized test, such as the NAEP, to determine if the performance of the school, as defined by state standards, meets national standards. In summary, in this "ideal" world the student would take not one but three (or more) major standardized tests each year, each subject to the same criticisms which are currently leveled against the SAT 9.
One may correctly conclude that although this might represent the "ideal," the cost involved in terms of dollars and time spent on testing would make this approach prohibitive.
Cizek has noted that "public demands for accountability and legislative responses tied to testing have created the need for tests that serve many masters and purposes. Responding to pressures to address these diverse concerns, commercial test publishers have attempted to develop products that attempt to serve multiple purposes" (Cizek, 1998). Because the SAT 9 is a norm-referenced test, it does offer an additional use. Information can be gleaned from the test regarding student and school performance as it relates to other students and schools throughout the nation. Some measure of the rigor and suitability of West Virginia IGOs may be made in relation to educational standards which exist in other states. It is stipulated that "performing at grade level means only that a student is performing about as well as the average performance of the norm group (and that) no evaluation is made regarding whether the norm group as a whole is performing superbly or terribly" (Cizek, 1998). Nevertheless, the SAT 9 does provide school assessment information which not only can be correlated to statewide standards but also indicates performance relative to a nationally representative group.
It is stipulated that there is a theoretical disadvantage to the use of the SAT 9 for performance assessment, in that the SAT 9 is a norm-referenced test. As has been noted, in "high stakes" testing there should be no expectation that a certain number of those tested should fail. As an example, West Virginia State Board of Education accreditation standards dictate that for a given school grade level, performance below the 30th percentile in grades 3 through 11 in one or more grade levels in the most recent year for which data are available and one of the two preceding years mandates school classification as "seriously impaired." On the surface, one might conclude that 30% of schools in West Virginia would be relegated to "seriously impaired" status each year. However, this conclusion is faulty.
First, it is debatable whether the SAT 9 test is truly a "high stakes" test for the school. It is true that "seriously impaired" or other less than full accreditation status is attached to test results, but the next question is "So what?" What are the implications of less than full accreditation? The only immediate implication is that the school personnel must set about to write improvement plans and make curriculum adjustments. Investigators and "improvement teams" may appear on the scene. Principals and teachers may be directed to modify their instructional techniques. Teachers and principals must spend additional time preparing reports and making such curriculum improvements as directed. Principals may be required to attend the Principal's Academy a year or two sooner than they otherwise would have. Prides will be wounded. If improvement is not noted after months of intervention and curriculum adjustment, teachers and principals might be transferred or given improvement plans and/or other personnel changes could be made. However, rarely does the process get farther than the paperwork and curriculum alignment stage. A pronouncement of "seriously impaired" or "less than fully accredited" cannot singularly lead to termination of an employee, a reduction in pay, or have any other adverse consequence on the school personnel.
Secondly, the assertion that "thirty percent of the schools are relegated to seriously impaired status" is flawed. A school or county's grade level percentile rank as reported by SAT 9 results is relative to a national norm reference group. Thirty percent of grade level scores in West Virginia are not at the 30th or lower national percentile. As has been noted earlier, the average percentile rank for grade levels in West Virginia is well above the 50th national percentile. The 1999 average county grade level score on the SAT 9 was 61 (both mean and median) with a range of 58 to 65 for each of grades 3-11. The only way that these scores could be manipulated to result in 30% of schools failing to meet a certain criterion would be if all West Virginia school grade level scores were plotted and a second norm curve relative to West Virginia grade levels was derived. In this case, a school grade-level score of 30th percentile relative to the national norm might in reality be at the 5th or 10th percentile relative to state norms (if indeed any grade levels in West Virginia actually scored that low).
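The accreditation rule described above (below the 30th national percentile in the most recent year and in one of the two preceding years) can be sketched as a simple check. This is an illustration under stated assumptions, not the State Board's procedure: the function name is hypothetical, the school histories are invented, and the sketch assumes that the same grade level must fall below the threshold in both of the years concerned, a point the wording of the standard as quoted here leaves open.

```python
THRESHOLD = 30  # national percentile named in the accreditation standard


def seriously_impaired(history, threshold=THRESHOLD):
    """history maps grade level -> [two years ago, last year, most recent year].

    Assumption: the same grade level must be below the threshold in the
    most recent year and in at least one of the two preceding years.
    """
    for scores in history.values():
        older, previous, latest = scores
        if latest < threshold and (previous < threshold or older < threshold):
            return True
    return False


# Hypothetical schools; the percentiles are invented for illustration.
typical_school = {3: [58, 60, 61], 6: [59, 62, 63]}
struggling_school = {3: [28, 33, 27], 6: [45, 44, 46]}
one_bad_year = {3: [40, 41, 29], 6: [50, 52, 51]}

print(seriously_impaired(typical_school))     # False
print(seriously_impaired(struggling_school))  # True
print(seriously_impaired(one_bad_year))       # False
```

Because the threshold is a fixed national percentile rather than a rank among West Virginia schools, the number of schools flagged depends on how far below the national norm they actually score, not on a quota.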
Those who condemn the use of a norm-referenced test in "high-stakes" testing must accept the alternative - a criterion-referenced test. This would require that schools and grade levels be assessed according to absolute performance on a criterion-referenced test devised by the West Virginia Department of Education. Every school could pass or every school could fail. Criterion-referenced tests, like norm-referenced tests, are based on content judged to be important in regards to the area being tested. However, with criterion-referenced tests, criteria for success are established in a judgmental fashion. Those criteria will be open to question, particularly if a significant number of test takers fail (or pass) the test. One may speculate that if such a change to criterion-referenced testing with minimum competency standards were made, the result would be simply trading one set of objections for another.
As a final note, Cizek has observed that "because no single approach currently provides a complete picture of student achievement, those responsible for mandating, conducting, or interpreting the results of testing programs must demand as much standards-based and norm-referenced information as possible" (Cizek, 1998). One may ponder the following question: What happens if, during the course of the year, everyone in a particular class has a "failing" (below 70% performance) grade? Generally, the grades are "curved." In other words, the criteria for success are adjusted by applying some norm-referenced standard. In the final analysis, one might contend that all criterion-referenced tests are in reality norm-referenced tests, in that the criteria for success are calibrated by norm-referenced data.
In summary, we may conclude that schools and teachers are obligated to design curriculum around the IGOs as these represent the standards appropriately established by the state. The SAT 9 is keyed to those standards. The state may attach such criteria for "success" as it chooses. Less than satisfactory performance by the school in relation to the SAT 9 test results in exhortations to the school/teacher to give more attention to achieving established educational goals, but has no other immediate detrimental effect. It is debatable whether this even amounts to "high stakes" testing. SAT 9 results are an appropriate factor to consider in the school accreditation process and to identify seriously impaired schools.
III-4) Is the SAT 9 an appropriate tool with which to assess a student's mastery of IGOs?
Perhaps the real question is "Is (or should) the SAT 9 (be) the only tool used to assess a student's mastery of the IGOs?" This might be a philosophical question, but in reality the answer is no - in West Virginia the SAT 9 is not the only tool used to assess the student's mastery of the IGOs. With the exception of the skills improvement program and the graduation warranty (discussed below), performance on the SAT 9 test seems to figure little into the overall assessment of the student's academic performance. Promotion to the next grade level is still based on performance standards set by the classroom teacher. Graduation is still dependent on the composite of those teacher-chosen standards. Standardized test performance does not figure into a student's eligibility for participation in extra-curricular activities, as do the teacher-chosen performance standards.
Some students do poorly on the SAT 9 test, some do average, some do well. The next question is another "So what?" What are the implications of a student performing below or above an admittedly arbitrary standard on the SAT 9?
a) The student may be placed in a skills improvement program
b) Certain pronouncements will be attached to the student's graduation credentials - the graduation "warranty."
In order to further discuss this question, one must analyze these implications.
III-5) Is the SAT 9 an appropriate tool to identify students with deficiencies in basic skills - reading, mathematics, and language arts?
This is a difficult question to address. "Deficiencies" is entirely a subjective pronouncement. Whether a student is "deficient" or "proficient" depends on how the terms are defined and what judgmental standard is used. One educator may be of the opinion that a given student's skills are deficient, while another educator might feel that the same student's skills are quite satisfactory. This survey has uncovered opinions from several high school teachers who indicate that many students advance to high school deficient in the basic skills. Perhaps the student's elementary and middle school teachers might disagree. It all boils down to the following questions: "What do we expect a student to master at a given grade level, what performance standards shall we set, and what are the implications for a student who fails to meet those standards? How do we define 'deficient' and what are the implications of being identified as deficient?"
Answering those questions is the intent behind educational standards (the IGOs) and the use of a uniform tool as one factor in assessing mastery of those standards. Performance standards are judgmental in nature, and it is the state which may set those standards. The validity of a given set of performance standards and the tool used to assess those standards is relative to the implications which attend success or failure in meeting those standards. For instance, one could set a standard: "All classroom teachers in West Virginia will pass minimum competency tests." If there are no particular implications attendant to that standard, then this standard might correctly be viewed as a minor annoyance. If it were to turn out that significant numbers of teachers were to be terminated next year because of failure to demonstrate proficiency, then, of course, the validity of the standard and the mechanics of the competency test come seriously into question.
Ultimately, the question comes down to another "So what?" Students who score below the 50th percentile in basic skills will be placed in a skill improvement program. So what?
III-6) Is the SAT 9 an appropriate tool to identify students who will be placed in a skills improvement program?
III-7) Is the SAT 9 an appropriate tool to provide specific information to teachers concerning those areas in which the student needs additional instruction?
Since the end result for the student is additional instruction in areas of deficiency, it would seem that there should be no objection to using the test for that purpose. The student is not being punished. No one should disagree with the concept of teaching a child something he/she doesn't know. Nevertheless, objections may be raised by parents as well as educators:
1) My child has made A's and B's all his life in English. Now you want to put him in remedial "bonehead" English. My child is not dumb. How dare you suggest my child is stupid!
We should identify skill improvement efforts for what they are - efforts to improve the student's skills in a particular area. We should avoid negative connotations which are often attached to such efforts, such as "remedial" or "bonehead" English. The choice of indicator used to place a student in a skills improvement program, whether it is performance on the SAT 9, overall performance in classroom work, recommendations made by a committee of educators, or some other process, is not the critical issue. As long as the student is being taught something he/she doesn't know, there should be no objection to this endeavor. Perhaps a public relations program to explain efforts directed to skill improvement is in order.
2) Why should the student be forced to waste his/her time covering IGOs relating to basic skills in English, math, and language arts when there are other subjects on which he/she could be spending time?
2-15 We have pull out classes that are only to help improve SAT scores. No other reason. That time could be on elective instead.
2-16 We have created an entire period of 30 mins a day to address deficiency of the Stanford. I have in my class students who do not need the remedial work and I lose this valuable time to teach content.
3-19 I have only changed my lesson plans to reflect the State's IGO's. I still plan my lessons to maximizes student performance, within my capabilities of time and resources.
4-24 Suggestions:
1. Cut out extra programs which are required in elementary schools. Teach the old basics: Reading (practice, practice, practice); Arithmetic (practice, drill, practice); writing, spelling, science, history, geography. Throw in some art and music to round out the individual. Nothing else need be taught in elementary school.
2. Hold students and parents accountable and responsible for their part in education.
3. Stop social promotions. I can only be accountable for what a student learns in my classroom. I can not get my required IGO's taught if I must go back over basic skills in a senior/junior class which should have been learned somewhere else - years earlier.
4-32 ... The ninth grade students do not know the parts of a sentence. Before I can help them to mature as readers and writers, I need to teach them the parts of a sentence. I will be neglecting the usage exercises that they need in order to perform well on the test. On the other hand, if I neglect the skills that they really need (parts of a sentence), then their language skill development will suffer throughout high school.
3) The skill improvement is a waste of time for the student. He/she already knows the material being taught in the skill improvement class and this time could be better spent for something else. The SAT 9 has forced the student to spend time on curriculum he/she has already mastered.
a) The student performed quite satisfactorily on the pre-test and probably didn't need the re-teaching (flawed selection process)
b) The re-teaching process itself was flawed.
c) Despite our best efforts the student for whatever reason didn't learn anything.
If it is determined that students are being put into skill improvement classes which they don't need (e.g. they have already mastered the curriculum being taught in those classes), and assuming that the curriculum is based on IGOs identified by analysis of SAT 9 tests, then we may conclude that the SAT 9 is a flawed indicator for placement of students in those classes. If, however, we can conclude that the student did learn something in the skill improvement program, then it is successful regardless of the process by which the student was placed in the program. If the student did poorly on both the pre-test and the post-test, then either the re-teaching process itself was flawed or the student for whatever reason can't or doesn't want to learn.
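The reasoning in the paragraph above amounts to a small decision rule applied to each student's pre-test and post-test results. The following is a minimal sketch of that rule; the function name and the thresholds `mastery` and `min_gain` are hypothetical placeholders, not state or county policy.

```python
def interpret_outcome(pre, post, mastery=70, min_gain=5):
    """Classify one student's pre-test/post-test pair.

    `mastery` and `min_gain` are HYPOTHETICAL thresholds chosen only
    for illustration.
    """
    if pre >= mastery:
        # Student had already mastered the material before re-teaching.
        return "flawed selection: student did not need the program"
    if post - pre >= min_gain:
        # Student learned something, however he/she was placed.
        return "successful: student learned something"
    # Poor on both tests with no meaningful gain.
    return "no gain: re-teaching flawed or student did not learn"

print(interpret_outcome(85, 88))
print(interpret_outcome(50, 72))
print(interpret_outcome(45, 47))
```

Only the first outcome would count against the SAT 9 as a placement indicator; the third points to the re-teaching process or to the student, not to the test.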
4) My students do need skill improvement, but not in the areas indicated by the SAT 9. This test does not give useful information about where my students need help.
Are re-teaching efforts based on SAT 9 data beneficial? If not, where is the flaw? It could be that some other indicator should be used to identify the basic skills, if any, which should be "re-taught" to the student.
In summary, the current SAT 9 driven IGO re-teaching efforts are valid until and unless we determine that students who need no improvement in the basic skills are being placed in skill improvement programs. Otherwise, if it is demonstrated that students are learning something in skill improvement programs driven by SAT 9 testing and IGO re-teaching, then this program is appropriate and successful and the SAT 9 is an appropriate test to use in this regard.
III-8) Should SAT 9 results serve as the basis for the school system to certify its graduates as proficient in basic skills at two levels - the graduation "warranty?"
This issue was not raised by the survey but is listed here for completeness. No data is available on this program, and no further discussion of this question will be attempted here.
Summary and Conclusions
The practice of setting educational standards and instructional goals for public schools is recognized by virtually everyone as a function legitimately lodged with the state. The State of West Virginia has the constitutional mandate to assure a "thorough and efficient education" for its children. The available evidence indicates that the West Virginia State Board of Education Instructional Goals and Objectives were formulated by leading educators in our state and have been the subject of detailed analysis by other independent experts. West Virginia's Instructional Goals and Objectives have been found to be at least suitable if not among the best in the nation for use as a set of standards on which to base our public education. Schools and teachers are obligated to design curriculum around the West Virginia State Board of Education Instructional Goals and Objectives as these represent the standards appropriately established by the state.
The state may use such criteria for accreditation of its schools as it chooses. In West Virginia the Stanford 9 test is only one of several indicators used to assess school and individual student performance. The Stanford 9 is appropriately keyed to the West Virginia State Board of Education Instructional Goals and Objectives. Technical objections may be made to the Stanford 9, but these objections may be raised against any test and are neither specific to nor do they invalidate the SAT 9 as an appropriate testing instrument. Less than satisfactory performance by the school in relation to the SAT 9 test results in exhortations to the school/teacher to focus on achieving established educational goals, but has no other immediate detrimental effect. The Stanford 9 results are therefore an appropriate factor to consider in the school accreditation process and to identify seriously impaired schools.
The individual student who performs below the 50th percentile on the Stanford 9 in the areas of reading, mathematics, and/or language arts is placed in a skills improvement program in those basic skills areas. The utility of the Stanford 9 in this regard can only be assessed by determining if students placed in these improvement programs actually improve their basic skills. If it can be demonstrated that students are learning something in skill improvement programs driven by Stanford 9 testing and re-teaching of the instructional goals and objectives, then the Stanford 9 is being appropriately and successfully used to help the students learn. If students who need no improvement in the basic skills are inappropriately placed in skill improvement programs on the basis of their Stanford 9 performance, then the use of the Stanford 9 in this regard would seem inappropriate.
Table One lists two numbers for each grade level for each year. The first number indicates how many of the 55 counties in West Virginia scored the same as or below Preston County for a given year in basic skills on that year's standardized test. The number in parentheses indicates how many points above or below the state average Preston County scored that year. For instance, in 1977 our third grade CTBS basic skills score was equal to or better than 44 of 55 counties, and was 5 points above the state average. The same year, our 11th grade CTBS basic skills score was the same as or better than 18 of 55 counties, and was 4 points below the state average.
Table Two lists two numbers for each grade level for each year. The first number indicates Preston County's "national percentile rank" for a given year in basic skills on that year's standardized test. The number in parentheses indicates the West Virginia state average for the "national percentile rank" in basic skills. For instance, in 1977 our third grade CTBS basic skills score was at the 58th percentile nationally, while the state average was the 53rd percentile nationally. The same year, our 11th grade CTBS basic skills score was at the 40th percentile nationally, while the state average was the 44th percentile nationally.
The CTBS (Comprehensive Tests of Basic Skills) was used from 1977 to 1996; the SAT-9 (Stanford Achievement Test, 9th Edition) has been in use since 1997.
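The two tables are linked: the parenthesized figure in Table One (points above or below the state average) is the difference between the two figures reported in Table Two. A minimal sketch of that arithmetic follows, using the 1977 and 1994 rows of Table Two; the variable and function names are illustrative only.

```python
# (Preston County percentile, state average percentile) for grades 3, 6, 9, 11,
# copied from the 1977 and 1994 rows of Table Two.
table_two = {
    1977: [(58, 53), (47, 50), (41, 47), (40, 44)],
    1994: [(54, 65), (50, 59), (62, 57), (59, 59)],
}

def points_vs_state(rows):
    """Preston County score minus the state average, per grade level."""
    return [county - state for county, state in rows]

for year, rows in table_two.items():
    print(year, points_vs_state(rows))
# 1977 [5, -3, -6, -4]
# 1994 [-11, -9, 5, 0]
```

These differences match the parenthesized entries for the same years in Table One (+5, -3, -6, -4 and -11, -9, +5, 0).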
Table One (left-hand columns): Preston County Schools Standardized Test Scores: Number of counties scoring the same as or below Preston County in basic skills (points above or below the state average in basic skills)
Table Two (right-hand columns): Preston County Schools Standardized Test Scores: Preston County's national percentile rank score in basic skills (West Virginia average national percentile rank score in basic skills)
| Grade | 3 | 6 | 9 | 11 | Grade | 3 | 6 | 9 | 11 |
| 1977 | 44 (+5) | 25 (-3) | 19 (-6) | 18 (-4) | 1977 | 58 (53) | 47 (50) | 41 (47) | 40 (44) |
| 1978 | 43 (+4) | 30 (+1) | 27 (-1) | 23 (-4) | 1978 | 54 (50) | 50 (49) | 46 (47) | 40 (44) |
| 1979 | 32 (+2) | 27 (-2) | 17 (-5) | 21 (-4) | 1979 | 55 (53) | 49 (51) | 43 (48) | 41 (45) |
| 1980 | 40 (+4) | 9 (-11) | 25 (-3) | 16 (-4) | 1980 | 60 (56) | 43 (54) | 46 (49) | 41 (45) |
| 1981 | 24 (-2) | 15 (-6) | 26 (-2) | 11 (-8) | 1981 | 55 (57) | 49 (55) | 48 (50) | 38 (46) |
| 1982 | 25 (-2) | 14 (-7) | 24 (-3) | 16 (-6) | 1982 | 57 (59) | 51 (58) | 49 (52) | 41 (47) |
| 1983 | 10 (-8) | 14 (-7) | 25 (-5) | 25 (-4) | 1983 | 52 (60) | 51 (58) | 49 (54) | 45 (49) |
| 1984 | 11 (-6) | 15 (-6) | 13 (-7) | 16 (-5) | 1984 | 55 (61) | 53 (59) | 49 (56) | 44 (49) |
| 1985 | 20 (-3) | 13 (-6) | 21 (-5) | 20 (-7) | 1985 | 54 (57) | 49 (55) | 45 (50) | 47 (54) |
| 1986 | 10 (-8) | 14 (-6) | 20 (-7) | 8 (-11) | 1986 | 54 (62) | 54 (60) | 43 (50) | 44 (55) |
| 1987 | 21 (-3) | 21 (-2) | 17 (-7) | 6 (-15) | 1987 | 62 (65) | 60 (62) | 44 (51) | 38 (53) |
| 1988 | 4 (-11) | 16 (-6) | 18 (-3) | 8 (-10) | 1988 | 57 (68) | 56 (62) | 49 (52) | 48 (58) |
| 1989 | 17 (-3) | 20 (-2) | 11 (-7) | 8 (-10) | 1989 | 65 (68) | 60 (62) | 46 (53) | 47 (57) |
| 1990 | * | * | 15 (-3) | 17 (-4) | 1990 | * | * | 50 (53) | 54 (58) |
| 1991 | 17 (-3) | 9 (-6) | 5 (-11) | 2 (-14) | 1991 | 67 (70) | 58 (64) | 42 (53) | 44 (58) |
| 1992 | 23 (-2) | 18 (-3) | 26 (-1) | 13 (-7) | 1992 | 57 (59) | 50 (53) | 55 (56) | 48 (55) |
| 1993 | 9 (-6) | 25 (-3) | 18 (-3) | 9 (-8) | 1993 | 57 (63) | 55 (58) | 54 (57) | 48 (56) |
| 1994 | 3 (-11) | 6 (-9) | 43 (+5) | 30 (0) | 1994 | 54 (65) | 50 (59) | 62 (57) | 59 (59) |
| 1995 | 20 (-3) | 10 (-5) | 42 (+4) | 24 (-2) | 1995 | 63 (66) | 54 (59) | 63 (59) | 56 (58) |
| 1996 | 18 (-6) | 16 (-5) | 30 (+1) | 22 (-3) | 1996 | 64 (70) | 58 (63) | 61 (60) | 56 (59) |
| 1997 | 14 (-6) | 23 (-2) | 33 (0) | 21 (-3) | 1997 | 52 (58) | 61 (63) | 55 (55) | 53 (56) |
| 1998 | 6 (-7) | 18 (-4) | 40 (+1) | 19 (-3) | 1998 | 55 (62) | 61 (65) | 59 (58) | 55 (58) |
| 1999 | 4 (-7) | 20 (-4) | 4 (-6) | 22 (-1) | 1999 | 56 (63) | 61 (65) | 53 (59) | 58 (59) |
* Strike year - no testing done at grades 3 and 6.
Third Grade: Number of counties scoring the same as or less than Preston County.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| # | 44 | 43 | 32 | 40 | 24 | 25 | 10 | 11 | 20 | 10 | 21 | 4 | 17 | * | 17 | 23 | 9 | 3 | 20 | 18 | 14 | 6 | 4 |
Third Grade: Variance of Preston County test scores from the state mean.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| V | +5 | +4 | +2 | +4 | -2 | -2 | -8 | -6 | -3 | -8 | -3 | -11 | -3 | * | -3 | -2 | -6 | -11 | -3 | -6 | -6 | -7 | -7 |
Sixth Grade: Number of counties scoring the same as or less than Preston County.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| # | 25 | 30 | 27 | 9 | 15 | 14 | 14 | 15 | 13 | 14 | 21 | 16 | 20 | * | 9 | 18 | 25 | 6 | 10 | 16 | 23 | 18 | 20 |
Sixth Grade: Variance of Preston County test scores from the state mean.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| V | -3 | +1 | -2 | -11 | -6 | -7 | -7 | -6 | -6 | -6 | -2 | -6 | -2 | * | -6 | -3 | -3 | -9 | -5 | -5 | -2 | -4 | -4 |
Ninth Grade: Number of counties scoring the same as or less than Preston County.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| # | 19 | 27 | 17 | 25 | 26 | 24 | 25 | 13 | 21 | 20 | 17 | 18 | 11 | 15 | 5 | 26 | 18 | 43 | 42 | 30 | 33 | 40 | 4 |
Ninth Grade: Variance of Preston County test scores from the state mean.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| V | -6 | -1 | -5 | -3 | -2 | -3 | -5 | -7 | -5 | -7 | -7 | -3 | -7 | -3 | -11 | -1 | -3 | +5 | +4 | +1 | 0 | +1 | -6 |
Eleventh Grade: Number of counties scoring the same as or less than Preston County.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| # | 18 | 23 | 21 | 16 | 11 | 16 | 25 | 16 | 20 | 8 | 6 | 8 | 8 | 17 | 2 | 13 | 9 | 30 | 24 | 22 | 21 | 19 | 22 |
Eleventh Grade: Variance of Preston County test scores from the state mean.
| Yr | 77 | 78 | 79 | 80 | 81 | 82 | 83 | 84 | 85 | 86 | 87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 |
| V | -4 | -4 | -4 | -4 | -8 | -6 | -4 | -5 | -7 | -11 | -15 | -10 | -10 | -4 | -14 | -7 | -8 | 0 | -2 | -3 | -3 | -3 | -1 |
American Federation of Teachers (2000, A) Introduction [Online] Making Standards Matter 1999, Available: http://www.aft.org//Edissues/standards99/intro.htm
American Federation of Teachers (2000, B) Judging State Standards Reforms [Online] Making Standards Matter 1999, Available: http://www.aft.org//Edissues/standards99/Judging.htm
American Federation of Teachers (2000, C) State by State Analysis, West Virginia [Online] Making Standards Matter 1999, Available: http://www.aft.org//Edissues/standards99/states/Westvirginia.htm
American Federation of Teachers (2000, D) Major Findings, Standards [Online] Making Standards Matter 1999, Available: http://www.aft.org//Edissues/standards99/findings.htm
American Federation of Teachers (2000, E) Table 2 [Online] Making Standards Matter 1999, Available: http://www.aft.org//Edissues/standards99/Table2.htm
Braden, Lawrence S., with Ralph A. Raimi (1998, March) State Mathematics Standards [Online] Fordham Foundation Standards Project, Vol. 2, No. 3. Available: http://www.edexcellence.net/standards/math.html
Cizek, Gregory J. (1998, October) Filling In the Blanks -- Putting Standardized Tests to the Test [Online] Fordham Report, Vol. 2, No. 11. Available: http://www.edexcellence.net/library/cizek.pdf
Education Week on the Web (1999) Academic Standards, Assessments, and Accountability [Online] Quality Counts 99, Vol. 18, No. 17. Available: http://www.edweek.org/sreports/qc99/states/indicators/in-t2.htm
Finn, Chester E., Jr., with Petrilli, Michael J. (2000, January) The State of State Standards 2000 [Online] Fordham Report, Available: http://www.edexcellence.net/library/soss2000/standards%202000.html
Munroe, Susan, with Terry Smith (1998, February) State Geography Standards [Online] Fordham Foundation Standards Project, Vol. 2, No. 2. Available: http://www.edexcellence.net/standards/geogrph.html
Harcourt Brace Educational Measurement (2000, January) Stanford Nine - Technical Information [Online] Harcourt Brace Educational Measurement. Available: http://www.hbem.com/trophy/achvtest/techinf.htm
Harcourt Brace Educational Measurement (2000, January) Stanford Nine - Overview [Online] Harcourt Brace Educational Measurement. Available: http://www.hbem.com/trophy/achvtest/sat9view.htm
Lerner, Lawrence S. (1998, March) State Science Standards [Online] Fordham Foundation Standards Project, Vol. 2, No. 4. Available: http://www.edexcellence.net/standards/science.html
Phelps, Richard P. (1999, January) Why Testing Experts Hate Testing [Online] Fordham Report, Vol. 3, No. 1. Available: http://www.edexcellence.net/library/phelps.htm
Ravitch, Diane. (1996, December) The State of Standards [Online] Network News & Views. Available: http://www.edexcellence.net/library/standard.html
Saxe, David Warren (1998, February) State History Standards [Online] Fordham Foundation Standards Project, Vol. 2, No. 1. Available: http://www.edexcellence.net/standards/history.html
West Virginia Board of Education (1999, January) Training Manual and Handbook for Education Performance Audits